Saturday, 28 December 2019

The true value of cloud-based AI isn’t what you think

Artificial intelligence is one of those concepts that was hot in the 80s, kind of went away, and now is red hot. Most point to AI’s new features and functions to explain its growing popularity, but it’s actually because public cloud computing has made it affordable. For a few hundred bucks a month you can have some pretty heavy-duty AI systems in place that would have cost millions 10 to 15 years ago.

However, integrating AI with applications such as banking, medical, manufacturing, and other systems is actually not where we’re finding the value of cloud-based AI. This is perhaps the most misunderstood aspect of AI’s value—now as well as in the future.

Those who sell AIops tools these days, especially where AI powers cloudops systems, understand this. Those who buy cloud-based technology, and currently are transferring core systems to public clouds, often don’t. Thus, the end-state cloudops systems and processes are not as valuable as they could be. What’s missing is AI and machine learning.

The points of value are clear to me, including:

The capability of self-healing. AI-based cloudops systems can learn how things get fixed by matching problem patterns with solution patterns over time. Eventually they can apply those fixes automatically, and better than humans can. This type of automation frees people from fixing ongoing minor and major issues and increases reliability, and the cloudops knowledge engines keep getting better as they gain experience. A brief sketch after these points illustrates the pattern-matching idea.

Better defense of cloud-based data and applications. Security and AI have long been two concepts related to each other in theory, but often not understood by either AI or security experts. Indeed, AI can allow secops systems to become proactive and learn as they go what constitutes a breach attempt, and how to defend against it.

Opportunities for sharing knowledge. An operationally oriented AI system has a great deal of value but has to learn things over time, which is fundamental to cognitive computing. What if knowledge could be shared in real time? In essence you’d have a smart ops system from day one, benefiting from collective learning and knowledge. This is going to be a larger push in the near future.
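To make the self-healing point concrete, here is a minimal, hypothetical sketch of how a cloudops knowledge engine might match a new incident against fix patterns learned from past incidents. The symptom signatures, remediations, and similarity threshold are invented for illustration and are not tied to any particular AIops product.

# Hypothetical self-healing loop: match a new incident's symptom against
# remediations learned from past fixes. All signatures and fixes are illustrative.
from difflib import SequenceMatcher

# "Knowledge engine": symptom signatures seen before, mapped to the fix that worked.
learned_fixes = {
    "disk usage above 90% on volume /var": "expand volume and rotate logs",
    "pod restart loop after config change": "roll back to previous deployment revision",
    "latency spike with connection pool exhausted": "scale out service and raise pool size",
}

def best_remediation(symptom, threshold=0.6):
    """Return the learned fix whose signature best matches the new symptom."""
    scored = [
        (SequenceMatcher(None, symptom.lower(), signature).ratio(), fix)
        for signature, fix in learned_fixes.items()
    ]
    score, fix = max(scored)
    return fix if score >= threshold else None  # below threshold: escalate to a human

print(best_remediation("disk usage above 95% on volume /var"))
# -> expand volume and rotate logs

In a real system the matcher would be a trained model rather than a string comparison, but the loop is the same: observe, match against accumulated experience, remediate automatically when confidence is high, and escalate otherwise.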

The reality is that AI is one of those things we tend to glamorize. Although we picture the science fiction depictions of AI systems, their day-to-day value is more pragmatic and less dramatic.

Edge computing best practices

Data processing, analytics, and storage increasingly are taking place at the network edge, close to where users and devices need access to the information. Not surprisingly, edge computing is becoming a key component of IT strategy at a growing number of organizations.
A recent report from Grand View Research predicted the global edge computing market will reach $3.24 billion by 2025, expanding at a “phenomenal” compound annual growth rate (CAGR) of 41% during the forecast period.
One of the biggest contributors to the rise of edge computing is the ongoing growth of the Internet of Things (IoT). The vast amounts of data created by IoT devices might cause delays and latency, Grand View says, and edge computing solutions can help enhance data processing power, which further aids in avoiding delays. Data processing takes place close to the source of the data, which makes it more feasible for business users to gain real-time insights from the IoT data that devices are gathering.
Also helping to boost the edge market is the presence of high-connectivity networks in regions such as North America.
Edge computing is used in a variety of industries such as manufacturing, IT and telecommunications, and healthcare. The healthcare and life sciences sector is estimated to see the highest CAGR between 2017 and 2025, Grand View says, because the storage capabilities and real-time computing offered by edge computing tools enable the delivery of reliable healthcare services in less time. The decision-making process is enhanced as network failures and delays are avoided.
Supporting edge computing can be challenging for organizations because it involves a lot of moving parts and a change in thinking from the current IT environment dominated by data centers and cloud-based services. Here are some best practices to consider when building a strategy for the edge.

Create a long-term edge computing vision
Edge computing involves a lot of different components, and it requires building an infrastructure with the capacity and bandwidth to ingest, transform, analyze, and act on enormous volumes of data in real time, says Matt Kimball, senior analyst, data center, at global technology analyst and advisory firm Moor Insights & Strategy.

On the networking side alone, it means deploying connections from devices to the cloud and to data centers. While companies might have a desire to ramp up their edge infrastructure as soon as possible in order to support IoT and other remote computing efforts, all of this is not going to happen overnight.

“Think big, act small – meaning map out the long-term vision for edge deployments” but don’t be in a rush to implement edge technologies all over the place right away, Kimball says.

The speed at which edge technologies can be rolled out varies based on industry, deployment model, and other factors, Kimball says. But given the rapid pace of innovation in the edge market, “it’s easy to get swayed by technology that is very cutting edge but maybe doesn’t contribute to an organization’s needs,” he says. “So, map out the vision and execute in small steps that are manageable.”

As part of planning the edge strategy, develop a business plan that will help secure a budget.

“Most organizations say that cost is a top concern – even above data security,” says Jennifer Cooke, research director, datacenter trends and strategies, at research firm International Data Corp. (IDC). “Obtaining budget is difficult and requires a solid plan for how edge IT is going to drive value for the business. Because cost is such a high concern, pay-per-use offerings will become increasingly sought after.”

Address cultural issues: Edge computing involves IT and operations
Putting processing power at the edge involves not just IT, but operational technology (OT) as well, and these are two separate organizations with different cultures and personalities, Kimball says.

“The OT folks are different,” Kimball says. “These are equally technical folks – in many cases, more technical – but focused on things like making sure a water treatment plant is operating properly through Supervisory Control And Data Acquisition (SCADA) process control systems.”

These are the systems that make sure valves open at certain times, for example, and environmental conditions are within specified ranges, Kimball says. It’s “IT for the industrial environment. So, processes, tools, and the kinds of technologies deployed and managed are different between the two organizations,” he says.

Bridging the two into one group that manages from the core data center out to the field or shop floor is a big challenge, but one that needs to be addressed. “Culture matters. If an organization can’t converge IT and OT at the organizational level, the convergence of technology will fall short,” Kimball says.

IT and operational teams must be equal partners, says Daniel Newman, principal analyst and founding partner at Futurum Research, a research and analyst firm. While edge computing is mainly driven by operational teams today, IT teams are responsible for managing these systems in more than two-thirds of enterprises, Newman notes in a 2018 study.

For edge computing to grow and increase its overall business value, IT must become more of a strategic collaborator with operational teams. It's not only managing edge computing resources, but also being involved in the long-term strategy, budgeting, and sourcing to ensure these systems are in line with larger, enterprise-wide strategic and transformational initiatives, Newman says.

Find partners to help with edge computing technology deployments
Many organizations say they lack the internal skills to support IT at the edge, Cooke says. “For this reason, we believe that many edge buildouts will happen through partnerships with colocation providers as well as vertical industry solutions through integrators,” she says.

IDC finds that many organizations are looking for a “one-stop solution” for delivering IT service at the edge. “Systems integrators with vertical market expertise will be sought after to help organizations along their edge journey,” Cooke says.

For example, a retail business might want to implement a solution, but is not interested in putting all the pieces together itself. Or it might want to derive insights from data on site at the edge and build the infrastructure to accomplish this, which can be complex.

“Beyond the software tools to analyze data, the solution needs connectivity as well as compute and storage infrastructure,” Cooke says. “Considerations such as controlling the physical environment [including temperature and humidity], physical security, and protection of equipment are important considerations as well.” An expert partner can help with all of this.

Don’t forget about edge computing security
As with any other aspect of IT, edge computing comes with its own set of cyber security threats and vulnerabilities. The InfoSec Institute, an organization that provides training for information security and IT professionals, in August 2018 noted a number of security issues related to the edge.

These risks include weak passwords for access to devices, which makes them easy targets for attackers; insecure communications, with data collected and transmitted by devices largely unencrypted and unauthenticated; physical security risks, because security is commonly acknowledged to be a low priority in the development of IoT and other edge devices; and poor service visibility, with security teams unaware of the services running on certain devices.

“It’s a top-of-mind issue,” Kimball says. “Not just security on the [device]. But security of the data that’s transmitted, security of the servers that sit on the edge and perform the data transformation and analysis, and security of the data as it travels from the edge to the cloud to the core data center.”

InfoSec Institute recommends actions such as expanding corporate password policies to testing and enforcing strong passwords on edge devices; encrypting the data sent by devices or using virtual private networking (VPN) to encrypt traffic in transit between devices and its destination; taking steps to provide devices with physical security protections; and identifying and securing services provided by devices, including analysis of network logs to identify traffic from unknown devices within an organization’s network perimeter.

Companies need to have a security strategy in place to properly secure both IoT and edge computing systems, Newman says, from a physical and logical perspective. That includes data that is processed and remains at the edge.

Prepare for rapid IoT growth: Edge computing scalability required
For some sectors such as manufacturing, healthcare, utilities, and municipal government, the growth of the IoT will likely be dramatic over the coming years in terms of the number of connected devices and the volumes of data gathered and processed, so companies will need to build scalability into their edge computing plans.

“Not only are we anticipating an increase in the overall percentage of data generated at the edge being processed at the edge, but we see an ongoing increase in the volume of data being created throughout the enterprise, and particularly in the intelligent edge of the future,” according to a 2018 Futurum report on the edge.

As edge computing expands to support operational IoT devices and data, the implementation of edge computing will make it easier to derive value from new IoT-based data sources, the report says. Without planning for the scalability of storage, data analytics, network connectivity, and other functions, companies will not be able to reap the full benefits of the edge or IoT.

Sunday, 22 December 2019

NVIDIA Brings AI To Health Care While Protecting Patient Data

Health care has been one of the early adopters of artificial intelligence (AI), because the technology has the ability to find needles in haystacks of data much faster than people can. This increase in speed can often save lives; time is of the utmost importance in this industry. Also, AI systems can often find things that are not apparent to even the most skilled clinician.

As an example, ZK Research recently interviewed a data scientist at a leading health-care institution in the Boston area where the radiology department used AI to inspect MRIs. AI systems can spot brain bleeds and other issues that are small and imperceptible to the human eye. This enables doctors to spend more time treating patients and less time diagnosing the problem.

Patient data privacy limits the use of AI in health care
One of the biggest factors holding back AI in health care is enabling machine learning and AI frameworks to access massive volumes of patient data without violating strict privacy regulations. At the recent annual Radiological Society of North America (RSNA) conference, NVIDIA demonstrated a solution that can get over this hurdle.

NVIDIA introduced its Clara Federated Learning, which uses a distributed, collaborative learning technique that keeps patient data inside the walls of a health-care provider instead of pulling it into a cloud service. This is accomplished by running Clara Federated Learning on the recently announced NVIDIA EGX intelligent edge computing platform.

NVIDIA, the industry’s GPU market leader, has been instrumental in bringing machine learning and AI into more verticals by building systems to address specific industry needs, and the new health-care use case is a great example. Clara Federated Learning (FL) is a reference application for distributed, collaborative AI model training that ensures privacy for patient information. The workload runs on edge servers from any number of NVIDIA partners, and participating hospitals train models globally by sharing model updates rather than labeled patient data. The effectively larger data set creates more accurate models and significantly reduces the time clinicians need to spend labeling data.

Clara Federated Learning speeds up AI training while protecting patient data
The system has been packaged into a Helm chart to make it easier to deploy on Kubernetes infrastructure. The NVIDIA EGX edge compute node securely provisions the federated server and the collaborating clients, delivering the full stack of what is needed to run a federated learning project, including containers and the initial AI model.

What makes the system unique is that it trains on data distributed across multiple health-care institutions to develop better AI models without sharing patient data. Each hospital can label its own patient data using the NVIDIA Clara AI-Assisted Annotation SDK, which has been integrated into a number of medical viewers such as 3D Slicer, MITK, Fovia and Philips IntelliSpace Discovery. Pre-trained models and transfer learning techniques dramatically speed up the learning time. Some hospitals have told ZK Research that processes that used to take hours can now be done in minutes, providing a huge boost to the organization.

Privacy is ensured because the training results are shared back to the federated learning server over a secure link, and because the system shares only model information, not patient records. The process runs iteratively until the AI model reaches a predetermined level of accuracy. The distributed model accelerates learning through the use of a larger data set but keeps patient information secure and private.
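To illustrate the mechanics in general terms, here is a minimal sketch of a federated averaging round. This is not NVIDIA’s Clara FL code; the toy model, the simulated hospitals, and the update rule are placeholders meant only to show that model parameters, never patient records, are what leave each site.

# Generic federated averaging sketch (illustrative only, not Clara FL itself).
import numpy as np

def local_training(global_weights, local_data, lr=0.5):
    # Each hospital updates the shared model on its own private data.
    X, y = local_data
    grad = X.T @ (X @ global_weights - y) / len(y)  # least-squares gradient
    return global_weights - lr * grad               # only these weights leave the site

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
hospitals = []
for _ in range(3):                                  # three simulated sites
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    hospitals.append((X, y))

global_weights = np.zeros(2)
for _ in range(20):                                 # iterate toward the accuracy target
    updates = [local_training(global_weights, data) for data in hospitals]
    global_weights = np.mean(updates, axis=0)       # federated averaging on the server

print(global_weights)   # approaches [2, -1] without any site sharing raw data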

United States, United Kingdom are leading the charge
The system is being developed in conjunction with a number of leading health-care organizations in the U.S. and UK. This includes the American College of Radiology, Massachusetts General Hospital, Brigham and Women’s Hospital Center for Clinical Data Science and UCLA Health, with the goal of developing personalized AI for their doctors and patients.

In the UK, NVIDIA has partnered with King’s College London and Owkin to create a federated learning platform for the National Health Service. The Owkin Connect platform running on Clara FL enables algorithms to be used in multiple hospitals. In this case, blockchain is being used as a distributed ledger to capture and trace the data used for modeling.

AI will change the world in ways never imagined, and there is no better use case than health care. Massive amounts of data are created by the health-care system today, but there is no mechanism to connect the dots to find key insights. NVIDIA’s Clara FL allows those dots to be connected--across historically disparate islands of data--without compromising patient privacy.

Kubernetes meets the real world

Ever since it emerged out of the halls of Google five years ago, Kubernetes has quickly become one of the hot technologies of the decade. Simply put, Kubernetes is now the undisputed platform of choice for composing and running applications made up of microservices – small, independently deployable services that run in containers and work together to function as a larger application that can be ported across various types of infrastructure.

Kubernetes is an orchestration tool, which in this case means it enables developers to view, coordinate, and manage containerized workloads and services with the goal of running resilient distributed systems. According to the latest figures from the Cloud Native Computing Foundation (CNCF), published in August 2018, 40 percent of respondents from enterprise companies (those with more than 5,000 employees) are already running Kubernetes in production.

While that’s good progress for the open source project, it’s important to note that the vast majority of these organizations are running only a handful of applications with Kubernetes as they get to grips with the technology. But the direction of travel is clear: Container-based microservices applications are the future and Kubernetes is their platform. That’s why the big three cloud providers have all launched managed versions of Kubernetes – and Cisco, HPE, IBM/Red Hat, Microsoft, VMware/Pivotal, and others have incorporated Kubernetes into their core software offerings.

Kubernetes is enabling enterprises of all sizes to improve their developer velocity, nimbly deploy and scale applications, and modernize their technology stacks. For example, the online retailer Ocado, which has been delivering fresh groceries to UK households since 2000, has built its own technology platform to manage logistics and warehouses. In 2017, the company decided to start migrating its Docker containers to Kubernetes, taking its first application into production in the summer of 2017 on its own private cloud.

The big benefits of this shift for Ocado and others have been much quicker time-to-market and more efficient use of computing resources. At the same time, Kubernetes adopters also tend to cite the same drawback: The learning curve is steep, and although the technology makes life easier for developers in the long run, it doesn’t make life less complex.

Here are some examples of large global companies running Kubernetes in production, how they got there, and what they have learned along the way.

Bloomberg reaps the benefits of early adoption
Financial data specialist Bloomberg turned to Kubernetes in 2015, when the tool was still in alpha, before moving into production in 2017 once the necessary continuous integration, monitoring, and testing was proved out.

Bloomberg processes hundreds of billions of financial data points every day, with 14,000 different applications powering its ubiquitous Terminal product alone. The IT organization wanted to boost the speed at which it could bring new applications and services to users and free up developers from operational tasks.

After assessing various orchestration platforms, such as Cloud Foundry, Mesosphere Marathon, and various Docker offerings, Bloomberg opted for Kubernetes because it “had a good foundation and it was clear they were confronting the right problems. You could see a vision and roadmap as to how it would evolve that were aligned with what we were thinking,” explains Andrey Rybka, head of compute infrastructure in the Office of the CTO at Bloomberg.

Over time Bloomberg has worked on a homegrown platform-as-a-service layer on top of Kubernetes to give developers the right level of abstraction to work effectively with the technology. This self-service web portal is essentially a command-line interface and REST API which integrates with a Git-based version control system, CI build system, and central artifact repository.

One of the key goals for Bloomberg was to make better use of existing hardware investments using the autoscaling capabilities of Kubernetes, along with the ability to self-provision and flex virtual compute, networking, and storage without having to issue tickets. “With Kubernetes, we’re able to very efficiently use our hardware to the point where we can get close to 90 to 95 percent utilization rates” at times of peak demand, Rybka said as part of a CNCF case study. Much of that efficiency comes from the ability to constrain resources for a given workload, so it doesn’t starve other workloads.

As is the case with most enterprises adopting Kubernetes in production, the main challenges arose around the use of YAML to write manifests, which specify how Kubernetes allocates resources. “These are powerful concepts in Kubernetes that require a steep learning curve,” Rybka said.

As Steven Bower, Bloomberg’s data and analytics infrastructure lead, put it: “Kubernetes makes a lot of things easier but not necessarily simpler.”

As a result, Bloomberg started with basic manifests, limited to a small subset of criteria from which developers could scale up their usage as they got more comfortable with the technology, as well as running plenty of internal training programs.

“We have a lot of existing infrastructure and there is zero chance that will miraculously move to Kubernetes off big iron [mainframes],” he said. Instead the orchestration platform is being targeted at web-based applications and net-new systems. In the data and analytics Infrastructure team, where Bower works, the initial approach was to stand up a new data science compute platform for the machine learning engineers to run complex workloads using tools like Spark and TensorFlow.

As his parting piece of advice, Rybka talked about the importance of building expertise. “You really have to have an expert team that is in touch with upstream Kubernetes and the CNCF and the whole ecosystem to have that in-house knowledge. You can’t just rely on a vendor and need to understand all the complexities around this,” he said.

News UK taps Kubernetes to scale on demand
The UK arm of media giant News Corp has been dabbling with Kubernetes since 2017, moving from their own custom Kubernetes clusters to the managed Elastic Kubernetes Service (EKS) from Amazon Web Services in 2018. This makes up part of a stack that also includes a bunch of AWS services, including Elastic Container Service, the Fargate compute engine, AWS Batch, and Elastic Beanstalk.

The first in-production application to be moved into this managed Kubernetes environment was a legacy Java system for access control and user login. Once the environment proved robust enough, the organization began steadily identifying and migrating other applications.

Speaking at monitoring specialist New Relic’s London Futurestack event earlier this year, Marcin Cuber, a former cloud devops engineer at News UK, said that “operationally, this simplifies what we have to maintain and monitor. On top of that we have EKS in its own isolated VPC, allowing us to specify our own security groups and network access control lists.”

The key goal for News UK was to better be able to scale up its environment around breaking news events and unpredictable reader volumes. “If there is breaking news, for example, we want every reader to be able to gather real-time updates worldwide and of course, to have a flawless experience,” Cuber said.

Where Kubernetes differs from VM autoscaling comes down to speed. “VMs take long to spin up and when there is a spike of traffic, it is not fast enough to bring new capacity into the AutoScalingGroup,” Cuber said. “Docker containers running in Kubernetes are smaller and lightweight, therefore allowing us to scale in a matter of a few seconds rather than minutes.”

Cuber also had some advice for any organizations looking to adopt Docker and Kubernetes. First was to make your Docker images as small as possible and to focus on running stateless applications with Kubernetes. “This will improve your scalability and portability,” he said.

Next is to run health checks for your applications and to use YAML to deploy anything. “This way you can utilize temporary credentials that will expire soon after your deployment and you never have to worry about static located credentials,” he added.

News UK also wanted to cut costs by pairing EKS clusters with AWS spot instances – where AWS sells spare compute capacity at a discount rate but can also reclaim that capacity at any time.

“There’s a huge advantage of using spot instances; we are making around 70 percent savings compared to on-demand pricing,” Cuber said. As a way to circumvent the issue of nodes being taken away, the engineers set up an AWS Lambda function that detects the termination signal from AWS and automatically drains the nodes due to be affected.
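A rough sketch of that pattern might look like the following. This is a generic illustration, not News UK’s actual function; the assumption that nodes are registered under their EC2 private DNS names, and the way cluster credentials reach the function, are stand-ins.

# Sketch: react to the "EC2 Spot Instance Interruption Warning" EventBridge event
# by cordoning the matching Kubernetes node (a full drain would also evict pods).
import boto3
from kubernetes import client, config

def handler(event, context):
    instance_id = event["detail"]["instance-id"]

    # Map the EC2 instance to its node name (EKS typically uses the private DNS name).
    ec2 = boto3.client("ec2")
    reservation = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"][0]
    node_name = reservation["Instances"][0]["PrivateDnsName"]

    # Mark the node unschedulable so no new pods land on it before termination.
    config.load_kube_config()   # assumes cluster credentials are available to the function
    client.CoreV1Api().patch_node(node_name, {"spec": {"unschedulable": True}})
    return {"cordoned": node_name}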

Amadeus drinks the Kubernetes Kool-Aid
Spanish travel tech giant Amadeus has been working with Kubernetes as far back as version 0.7 five years ago. In the ensuing two years the company was keen to see things like monitoring, alerting, and the wider ecosystem mature before committing any business-critical applications to Kubernetes. The company now feels it made the right bet.

Amadeus is one of the big three global distribution systems that enable travel agents and metasearch engines like Expedia and Kayak to sell flight, hotel room, and rental car bookings. Late in 2016 the organization started to move its first application – for airline availability – to Kubernetes in production, hand in hand with Red Hat’s OpenShift platform. The plan was actually to move a hotel reservation application first, but as that project bloated, the airline availability application, which was built for Linux and needed to be moved to the public cloud to better serve its airline clients’ growing demands for lower latency, made it to production faster.

“The good thing we had from the start is all our apps are on Linux, so they are container-friendly from the start,” Etienne said. “Of course they were monolithic, but it was really more about how to move existing apps to containers and then Kubernetes, so the position was pretty straightforward.”

Shifting to Kubernetes fit with a broader business goal for Amadeus to shift from on-premises deployments to the public cloud, predominantly with its partner Google Cloud, so that it could better scale to meet seasonal demand and cut down on over-provisioning infrastructure costs.

In terms of challenges, Amadeus is a strong engineering organization, so once some training had been completed the technical challenges paled into insignificance compared to the cultural shift that tools like Kubernetes required from the organization.

“One of the main challenges is shifting mindset in terms of what it means for developers,” Etienne said. “They used to think about the machine the application runs on and now you forget about the machine and everything is configuration driven with YAML files everywhere.”

“Everyone was already getting ready for containers, so the biggest shift was operating apps in an agnostic way,” he added.

The overall goal for Amadeus is to move all production workloads to run on a single operating model with Kubernetes, and the organization is around 10 to 15 percent of the way there so far. “As with any strategy, if we reach that goal, it is too early to say,” said Sebastien Pellise, director of platform solution management at Amadeus.

Another, softer benefit of adopting tools like Kubernetes is with recruiting and retaining talent, because “working on these type of things is so much more sexy to advanced engineers than working on a mainframe,” said Dietmar Fauser, former SVP of technology platforms and engineering at Amadeus, in an interview earlier this year.

Gearing up for a Kubernetes future
One of the more interesting aspects of these various case studies is their consistency. Regardless of industry – be it financial services, media, retail, or technology – organizations of all sizes are grappling with a sea change in the way software is built and deployed in small, discrete, loosely coupled chunks of functionality.

There are also consistencies among challenges and benefits. All of these organizations are compelled to enact sometimes painful cultural change and face significant recruitment challenges as they compete for talent with the likes of Google and Facebook. All of these organizations are also starting to speed up their development cycles, reduce costs and downtime, and deliver more value more frequently for their customers.

At this point, it’s not an exaggeration to say that any organization that fails to get up to speed with containers and Kubernetes will struggle to keep up in our new, accelerated, software-driven world.

What’s next for the cloud data warehouse

If multicloud is the strategy for data warehousing today, then cross-cloud is its vision for tomorrow. This prediction comes from a universal need to seamlessly move and exchange data across different regions within the same cloud provider and even across different clouds.

Circumstances such as geographic location and the incompatibility of cloud platforms hinder the goal of globally accessible data. As a result, companies struggle to securely share data across an enterprise (and beyond), to manage latency between business locations, and to bring together silos of data that result from using multiple clouds.

Change is on the horizon. Soon, all organizational data will know no borders. No matter where they store data or which cloud providers they use, companies will access all of their data from anywhere and everywhere, if they choose to do so. 

Data limitations today
Although the benefits of the cloud are well documented, cloud service providers have yet to deliver on its full promise due to two significant factors:

  • Geography: The nature of cloud delivery requires companies to use regional clouds. The reason: Services work best when users are in close proximity. Anyone who has attempted to query or share data that is stored in distant clouds knows that latency is a problem. Therefore, businesses often create individual accounts by region. These accounts become the physical place where data is stored and queried by local users. This setup is less than ideal for companies in multiple regions because they can’t easily share data across the organization.
  • Proprietary APIs: The major cloud platforms (Amazon Web Services, Microsoft Azure, Google Cloud Platform) are all built with proprietary APIs. As a result, companies with a multi-cloud strategy end up spreading their data across cloud platforms. Without an easy way to share, data once again becomes siloed—this time in cloud platforms rather than in on-premises servers. 

The challenges that arise from this present-day reality include: 
  • Inability to analyze all data: Data is created and stored locally, which is inadequate for multinational organizations with a global presence. Although local systems may work fine, it’s a complex process to centralize all relevant data required for answering important business questions.
  • Lack of connection to other systems: Connecting data centers across regions, countries, and continents requires intricate infrastructure setups and continuous maintenance to ensure secure and seamless connections. This work is complicated and expensive, especially when it requires moving large volumes of data across data centers that are far apart. As a result, many data systems are not connected to each other, regardless of whether they live in the cloud. 
  • Complicated replication processes: As a general rule, replicating data across regions is an extremely involved, distributed process, which makes it expensive to set up and complicated to manage. Only high-end companies tend to have the resources and manpower to tackle it.
  • Concerns about vendor lock-in: Much like 40 years ago when organizations didn’t want to be locked into a particular hardware vendor, companies are now concerned about lock-in with a single cloud provider. Organizations want the freedom to move their data and applications in order to benefit from new services or better pricing. Data portability becomes a daunting proposition, especially when companies have multiple petabytes of data to move. 
The benefits of global data
The vision has always been an interconnected world of data on one unified platform. This prospect will become a reality when we build bridges between all regional instances and cloud providers so that data can move freely. To achieve this future state, we need cross-cloud capabilities.

In my mind, cross-cloud has two requirements. The first is the creation of a cloud-agnostic layer, which provides a unified data management platform on top of each cloud region, built by any cloud provider. The second requirement is interconnecting these regions through a high-throughput communication “mesh” that allows data to move anywhere—between regions, within and across continents, and even across regions managed by different cloud providers. 

In short, regardless of where data resides or what proprietary cloud system is used, the cloud-agnostic layer and mesh can run on any cloud system. The result is the removal of all barriers to data and the creation of what I like to call a “virtual multicloud global data center” where data is easily and inexpensively accessible no matter where it is stored. 

With this type of analytics data platform, businesses can:
  • Bridge geographic regions and move data with ease. The same code can be run on top of different clouds to perform global analytics. 
  • Run on any cloud platform they want and be truly multicloud. The threat of being locked into a single provider disappears.  
  • Use replication to make latency a challenge of the past by solving for proximity and completeness of data. Rather than use data that’s stored in a different region or on a different continent, organizations can replicate remote data and combine it with local data to form a single, centralized place to access all global data. 
  • Store two or more copies of data using modern replication, which is crucial for failover and business continuity. Moreover, organizations will be able to build high-availability systems at a fraction of the cost of legacy replication systems.
Achieve true data-driven decision-making
We live in a global business world. Boundaries are being shattered, and cloud barriers must be among them. Organizations need global data to achieve truly informed, data-driven decision-making.

Cross-cloud delivers on the promise of global data and empowers organizations to fully execute on multicloud strategies. By enabling data to move freely and securely, and be consolidated into a single source of truth, organizations will become truly global.  

IBM Cloud Pak for Security: Full Protection Wherever Data Resides

The routes that organizations are taking to cloud computing are pretty well set. Rather than flocking to individual public clouds as evangelists once envisioned, enterprises are instead maintaining data on premises in various systems and private clouds while also engaging with multiple public cloud platforms.

A larger question remains in how, and how well, valuable data and application assets can be protected when they are widely dispersed in these hybrid multi-cloud environments. That issue is especially pointed considering the increasing frequency and sophistication of attacks by cyber-criminals and rogue states.

Fortunately for businesses, security vendors including IBM are pushing forward individually and in partnerships to address these challenges. IBM’s recently announced Cloud Pak for Security incorporates its own formidable assets and also integrates new open source security technologies developed by both the company and its strategic partners. The new platform is part of a family of six IBM Cloud Paks, one being IBM Cloud Pak for Data, a platform that enables customers to comprehensively explore, manage, analyze and govern myriad data assets across their organizations.

Multi-cloud security challenges
What is the primary issue impacting multi-cloud security? Data and application fragmentation have to be at the top of the list. The more that companies work with cloud technologies, the more comfortable they become. Since no single cloud platform provides everything that organizations need, multi-cloud has become the de facto approach for businesses. In fact, IBM’s 2018 Institute for Business Value study found that while 76% of respondents were already using between two and fifteen hybrid clouds, 98% said they will be using multiple hybrid clouds within three years.

Hybrid cloud engagements tend to result in applications and data being spread across private and public clouds, as well as in on premises IT resources. Keeping track of these assets is hard enough but protecting them is even more difficult. Why so? Because of the breathtaking variety of security applications, tools and services utilized by owners and cloud providers. As a result, security teams must devise complex integrations requiring them to switch back and forth between management screens and various point products.

In a recent IBM Security-sponsored SANS Institute survey, over half of security team respondents noted that they struggle to integrate data with disparate security and analytics tools, and to combine data resources spread across hybrid cloud environments in order to spot advanced threats. Don’t be surprised if this situation sounds familiar. A decade ago, many businesses struggled with data that became isolated within departments and work groups.

These so-called information “siloes” were headaches from the perspective of management and governance requirements, and often impacted the core value of data resources and investments. It doesn’t require a leap of imagination to see how, absent proper management, hybrid cloud environments could spawn new generations of data siloes located well-outside and beyond the control of business owners. Considering the ever-increasing number of cyber threats and data-seeking bad actors, preventing such outcomes is a topline business goal.

IBM Cloud Pak for Security
So, what is IBM doing to positively impact these issues? According to the company, its new Cloud Pak for Security can connect with any security tool, any public or private cloud and any on premises IT system, enabling data to be scanned and analyzed for cyber threats and security vulnerabilities without moving it from its original source.

The platform can search and translate security data from a variety of resources, collecting insights across multi-cloud environments. IBM also notes that the platform is extensible, enabling new tools and applications to be added to it over time so that Cloud Pak for Security can evolve to address new security threats.

Initial capabilities include:

  • Since it is composed of containerized software pre-integrated with the Red Hat OpenShift Kubernetes platform, Cloud Pak for Security installs easily in any on-premises, private cloud or public cloud environment.
  • Rather than transferring offsite data for security analysis, a time-consuming and costly process that many conventional solutions require, Cloud Pak for Security smoothly connects to data sources where they reside, detecting hidden threats and helping customers make better-informed risk-based decisions.
  • Rather than manually searching for threat indicators, like malware signatures and malicious IP addresses within each individual environment, a Data Explorer application allows analysts to streamline the hunt for threats across security tools and clouds. According to IBM, Cloud Pak for Security is the first tool that allows this type of search without moving data into the platform for analysis.
  • Cloud Pak for Security allows companies to orchestrate and automate their response to hundreds of common security scenarios, guiding users through the process and providing quick access to security data and tools. IBM’s Security Orchestration, Automation and Response capability also integrates with Red Hat Ansible for additional automation playbooks. This allows security teams to address threats and prioritize their time more effectively, a crucial point, since IBM Security estimates that enterprise security teams manage an average of 200,000 potential security events per day and, during that process, coordinate responses across dozens of tools and applications.

It’s worth noting that IBM collaborated with numerous clients and service providers during the Cloud Pak for Security design process in order to address critical security interoperability challenges. The new platform includes connectors supporting pre-built integrations with security tools from IBM, BigFix, Carbon Black, Elastic, Splunk and Tenable, as well as public cloud providers including IBM Cloud, AWS and Microsoft Azure. Since Cloud Pak for Security is built on open standards, it can connect additional security tools and data from across a customer’s infrastructure.

Final analysis
The IT industry is rife with sometimes intriguing commercial products that have little, if any, practical application. These “solutions in search of a problem” are often designed in obverse fashion, betraying developers’ lack of insight into customers’ actual needs and requirements. In sharp contrast, with Cloud Pak for Security IBM has devised and designed a solution to address numerous specific problems that plague enterprise customers, as well as critical issues impinging on the adoption and benefits of hybrid multi-cloud computing.

This won’t come as a surprise to anyone who has paid attention to the sizable investments and commitments IBM has made to open source and open standards. The company also keenly understands the value of collaborating with innovative partners, an especially critical point in the rapidly evolving and fundamentally heterogeneous world of hybrid multi-cloud computing.

Overall, the IBM Cloud Pak for Security platform should provide substantial benefits to its existing customers and is an offering that prospective IBM clients would do well to consider.



Tuesday, 17 December 2019

Five Positive Use Cases for Facial Recognition

While negative headlines around facial recognition tend to dominate the media landscape, the technology is creating positive impacts on a daily basis -- even if those stories are often overshadowed by the negative noise. It is the mission of industry leaders in computer vision, biometric and facial recognition technologies to help the public see how this technology can solve a range of human problems.

In fact, the industry as a whole is tasked with advocating for clear and sensible regulation, all while applying guiding principles to the design, development and distribution of the technologies they are pursuing. AI solutions are solving real-world problems, with a special focus on deploying this technology for good. 

In this eWEEK Data Points article, Dan Grimm, VP of Computer Vision and GM of SAFR, a RealNetworks company, uses his own industry information to describe five use cases on the socially beneficial side of facial recognition.

Data Point No. 1: Facial Recognition for School Safety 
With school security a top priority for parents, teachers and communities, ensuring a safe space is vitally important. It can be difficult to always monitor who’s coming and going, and school administrators need a way to streamline secure entry onto campus property. 

K-12 schools are using facial recognition for secure access--a system that requires a person to be an authorized individual (such as a teacher or staff member) in order to gain access to the building. This not only helps keep students safe but also makes it easier for parents and faculty to enter school grounds during non-peak hours.

Facial recognition is being used to alert staff when threats, concerns or strangers are present on school grounds. Any number of security responses can be configured for common if-this-then-that scenarios, including initiating building lockdowns and notifying law enforcement, when needed. 
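As a rough illustration of how such if-this-then-that responses could be configured (the event fields, watchlist names, and response actions below are hypothetical, not taken from any particular product):

# Hypothetical rule engine mapping recognition events to security responses.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class RecognitionEvent:
    person_id: Optional[str]   # None if the face matched no enrolled individual
    watchlist: Optional[str]   # e.g. "staff", "flagged", or None
    location: str
    confidence: float

def respond(event: RecognitionEvent) -> List[str]:
    actions = []
    if event.confidence < 0.75:
        return actions                       # too uncertain to act on automatically
    if event.watchlist == "flagged":         # known threat or concern
        actions += ["notify_administrators", "initiate_lockdown", "alert_law_enforcement"]
    elif event.person_id is None:            # stranger on school grounds
        actions += ["alert_front_office", "dispatch_staff_to:" + event.location]
    elif event.watchlist == "staff":         # authorized teacher or staff member
        actions += ["unlock_entrance"]
    return actions

print(respond(RecognitionEvent(person_id=None, watchlist=None,
                               location="east entrance", confidence=0.92)))
# -> ['alert_front_office', 'dispatch_staff_to:east entrance']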

Data Point No. 2: Facial Recognition for Health Care 
As our population grows, so does the need for more efficient healthcare. Plain and simple, there isn’t time in busy physician offices for mistakes or delays. Facial recognition is revolutionizing the healthcare industry, whether through AI-powered screenings and diagnoses or secure access.

Healthcare professionals are using facial recognition technologies in some patient screening procedures. For example, the technology is being used to identify changes to facial features over time, which in some cases represent symptoms of illnesses that might otherwise require extensive tests to diagnose--or worse, go unnoticed.

Data Point No. 3: Facial Recognition for Disaster Response and Recovery 
When first responders arrive on the scene of an emergency, they’re looked to as calming forces among the chaos. With every moment critical, time is precious as each second could spell the difference between favorable and unfavorable outcomes.

A first responder outfitted with a facial recognition bodycam could quickly scan a disaster site for matches to a database of victims. This piece of technology has the ability to immediately know the names of victims, which enables first responders to deliver more efficient care, transform outcomes and deliver faster peace of mind to family members awaiting news of their loved ones. 

In critical-care situations, knowing the blood type of each person in a disaster zone as they are identified by first responders could, in turn, save more lives. This application would require the victims' family members to provide photos and blood type information so the emergency responders could scan the disaster area for the blood types needed.

Data Point No. 4: Facial Recognition for Assisting the Blind
In our media-driven world, it can be challenging for blind persons to gain access to information. Finding ways to translate visual information into aural cues to make data more easily accessible has the potential to be life changing. 

Facial recognition apps highly tuned to facial expressions help blind persons read body language; specifically, an app equipped with this technology would enable a person to “see” a smile by facing their mobile phone outward. When someone around them is smiling, the phone vibrates--a transformative experience for someone who has never seen a smile and otherwise has to work extra hard, using other senses, to detect whether the people around them are smiling.

Another mobile app is geared toward achieving greater situational awareness for the blind, announcing physical obstacles like a chair or a dog along the way, as well as reading exit signs and currency values when shopping. This not only enables blind persons to navigate their surroundings more efficiently, but also gives them greater control and confidence to go about their everyday life without those accustomed hurdles. 

Data Point No. 5: Facial Recognition for Missing Persons 
From runaways to victims of abduction and child trafficking, it’s believed that tens of thousands of kids go missing every year. This statistic is unacceptable, especially in our digitally connected world. It is up to us, as technology entrepreneurs, to find new ways to work with local authorities to protect our most vulnerable demographic.

Facial recognition is addressing the missing persons crisis in India. In New Delhi, police reportedly traced nearly 3,000 missing children within four days of kickstarting a new facial recognition system. Using a custom database, the system matched previous images of missing kids with about 45,000 current images of kids around the city. 

Because children tend to change in appearance significantly as they mature, facial recognition technology has also been used with images of missing children to identify them years -- or even decades -- later. Parents and guardians provide local authorities with the last known photos they have of their children, and police match those against a missing persons database. Police can then search local shelters, homeless encampments and abandoned homes with this advanced technology, giving parents hope long after investigations have seemingly stalled.

Wednesday, 11 December 2019

Amazon joins the quantum computing crowd with Braket testbed

Amazon’s initial foray into the heavily hyped world of quantum computing is a virtual sandbox in which companies can test potential quantum-enabled applications and generally get to grips with the new technology, the company announced Monday.

The product is named Braket, after a system of notation used in quantum physics. The idea, according to Amazon, is to democratize access to quantum computing in a small way. Most organizations aren’t going to own their own quantum computers for the foreseeable future; they’re impractically expensive and require a huge amount of infrastructure even for the limited proof-of-concept models at the current cutting-edge.

Hence, providing cloud-based access to three of those proofs-of-concept – the D-Wave 2000Q, the Rigetti 16Q Aspen-4, and an IonQ linear ion trap – offers businesses the opportunity to learn firsthand about the way qubits work and how the basic building blocks of quantum programming might look. Braket will let users work remotely with those quantum computers or try out quantum algorithms in a classically driven simulated environment.

“Our goal is to make sure you know enough about quantum computing to start looking for some appropriate use cases and conducting some tests and experiments,” said chief AWS evangelist Jeff Barr in a blog post.
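For a sense of what those first tests can look like, here is a minimal sketch using the Braket Python SDK’s local simulator to run a two-qubit Bell-state circuit; this is an illustrative example assuming the SDK’s Circuit and LocalSimulator interfaces, so consult the documentation for the current details.

# A small Bell-state experiment on the Braket local simulator.
from braket.circuits import Circuit
from braket.devices import LocalSimulator

# Hadamard on qubit 0 followed by CNOT: measurements should split roughly 50/50
# between "00" and "11", the signature of an entangled pair.
bell = Circuit().h(0).cnot(0, 1)

device = LocalSimulator()
result = device.run(bell, shots=1000).result()
print(result.measurement_counts)   # e.g. Counter({'00': 507, '11': 493})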

To help guide those efforts, Amazon also announced that it would form the AWS Center for Quantum Computing in partnership with Caltech. The idea here seems to be to create a center of excellence for research into both how quantum computers can be put to use and how they can be manufactured on a slightly larger scale. Furthermore, the new Amazon Quantum Solutions Lab would allow for a collaborative space in which companies can partner to share newfound expertise in quantum computing, as well as workshops and brainstorming sessions for education on quantum topics.

“Quantum computing is rapidly evolving, but the limited scale of the quantum hardware available today, fragmented development tools, and general shortage of quantum expertise, make it difficult to build near-term quantum applications,” said Amazon in a statement.

Quantum computing technology is still in the very early stages of development – something like classical computing in the days of the Bletchley Park codebreaking machines, or ENIAC at the latest. Yet major tech companies have been eager to grab headlines in the field. Google boasted in October of having achieved quantum supremacy, the ability to solve a problem with a quantum computer more quickly than with a classical one.

This sort of cloud-based quantum testbed isn’t a wholly new idea. IBM has offered its Q Experience platform since 2016, and the company recently announced that more than 10 million experiments have been run there to date. And Amazon’s cloud rival Microsoft announced its Azure Quantum service just last month, offering a similar combination of cloud access, quantum programming tools, and remote access to prototype quantum computers.

Monday, 9 December 2019

What is Data Flow Testing? Application, Examples and Strategies

Data Flow Testing is a specific strategy of software testing that focuses on data variables and their values, and it makes use of the control flow graph. In terms of categorization, data flow testing is a form of white box, structural testing. It checks the points at which variables receive values and the points at which those values are used, and it is done to close the gap left by path testing and branch testing.
The process is conducted to detect bugs caused by the incorrect usage of data variables or data values--for example, the faulty initialization of data variables in programming code.
What is Data flow Testing?
  • The programmer can perform numerous tests on data values and variables. This type of testing is referred to as data flow testing.
  • It is performed at two abstract levels: static data flow testing and dynamic data flow testing.
  • The static data flow testing process involves analyzing the source code without executing it.
  • Static data flow testing exposes possible defects known as data flow anomalies.
  • Dynamic data flow testing identifies program paths from the source code by executing it.
Let us understand this with the help of an example.
[Figure: data flow testing example]
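The example figure is not reproduced here, so the following reconstruction is inferred from the walkthrough below; the statement numbering and the exact conditions are assumptions consistent with the two paths the text describes (1-2-3-8 and 1-2-4-5-6-5-6-5-7-8).

# Reconstructed eight-statement example (numbering matches the walkthrough below).
def example(x):          # statement 1: x is defined (assigned its input value)
    if x > 0:            # statement 2: predicate use of x
        a = x + 1        # statement 3: computational use of x, definition of a
    elif x <= 0:         # statement 4: predicate use of x
        while x < 1:     # statement 5: predicate use of x
            x = x + 1    # statement 6: use and redefinition of x
        a = x + 1        # statement 7: computational use of x, definition of a
    print(a)             # statement 8: use of a

example(1)    # path 1, 2, 3, 8                    -> prints 2
example(-1)   # path 1, 2, 4, 5, 6, 5, 6, 5, 7, 8  -> prints 2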
There are 8 statements in this code, and we cannot cover all 8 on a single path: if the condition at statement 2 is true, then statements 4, 5, 6, and 7 are not traversed, and if the branch at statement 4 is taken, statement 3 is not traversed.
Hence we consider two paths so that we can cover all the statements.
x= 1
Path – 1, 2, 3, 8
Output = 2
If we consider x = 1: at statement 1, x is assigned the value 1; we then move to statement 2 and, since x > 0 is true, on to statement 3 (a = x + 1); finally, control reaches statement 8, which prints the output a = 2.
For the second path, we assign x the value -1.
Set x = -1
Path = 1, 2, 4, 5, 6, 5, 6, 5, 7, 8
Output = 2
x is set to -1 at statement 1; at statement 2 the condition x > 0 is false (here x = -1), so control jumps to statement 4. Since the condition at statement 4 is true (x <= 0), execution moves on to statement 5 (x < 1), which is also true, so it proceeds to statement 6 (x = x + 1), where x is increased by 1.
So,
x=-1+1
x=0
x becomes 0 and control returns to statement 5 (x < 1); as the condition is still true, it jumps to statement 6 (x = x + 1):
x=x+1
x= 0+1
x=1
x is now 1 and control returns to statement 5 (x < 1); the condition is now false, so execution jumps to statement 7 (a = x + 1), setting a = 2 since x is 1. At the end the value of a is 2, and at statement 8 we get the output 2.
Steps of Data Flow Testing
  • Creation of a data flow graph.
  • Selection of the testing criteria.
  • Classification of paths that satisfy the selection criteria in the data flow graph.
  • Development of path predicate expressions to derive test inputs.
The life cycle of data in programming code
  • Definition: the variable is declared, created, and initialized, and memory is allocated to its data object.
  • Usage: the use of the data variable in the code. A variable can be used in two ways: in a predicate (P-use) or in a computation (C-use).
  • Deletion: the memory allocated to the variable is released; the variable is killed. The short snippet after this list illustrates all three stages.
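A short annotated snippet makes the three stages concrete; the variable names and values are arbitrary:

total = 0
limit = 3
count = 0                     # definition: count is created and initialized
while count < limit:          # predicate use (P-use) of count
    total = total + count     # computational use (C-use) of count
    count = count + 1         # C-use of count followed by a redefinition
del count                     # deletion: count is killed and its memory released
print(total)                  # prints 3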
Types of Data Flow Testing
  • Static Data Flow Testing
No actual execution of the code is carried out in static data flow testing. Generally, the definition, usage, and kill patterns of the data variables are scrutinized through a control flow graph.
  • Dynamic Data Flow Testing
The code is executed to observe the intermediate results. Dynamic data flow testing includes:
  • Identification of definition and usage of data variables.
  • Identifying viable paths between definition and usage pairs of data variables.
  • Designing & crafting test cases for these paths.
Advantages of Data Flow Testing
Data flow testing helps detect anomalies such as:
  • Variables that are used but never defined,
  • Variables that are defined but never used,
  • Variables that are defined multiple times before they are actually used,
  • Variables that are deallocated before they are used.
Data Flow Testing Limitations
  • Testers require good knowledge of programming.
  • Time-consuming
  • Costly process.
Data Flow Testing Coverage
  • All definition coverage: covers “sub-paths” from each definition to at least one of its respective uses.
  • All definition-C use coverage: covers “sub-paths” from each definition to all of its respective C-uses.
  • All definition-P use coverage: covers “sub-paths” from each definition to all of its respective P-uses.
  • All use coverage: covers “sub-paths” from each definition to every respective use, irrespective of type.
  • All definition use coverage: Coverage of “simple sub-paths” from each definition to every respective use.
Data Flow Testing Strategies

The following are the test selection criteria:
1. All-defs: For every variable x and node i such that x has a global definition in node i, pick a complete path that includes a def-clear path from node i to
  • an edge (j,k) having a p-use of x, or
  • a node j having a global c-use of x.
2. All c-uses: For every variable x and node i such that x has a global definition in node i, pick complete paths that include def-clear paths from node i to all nodes j having a global c-use of x in j.
3. All p-uses: For every variable x and node i such that x has a global definition in node i, pick complete paths that include def-clear paths from node i to all edges (j,k) having a p-use of x on edge (j,k).
4. All p-uses/Some c-uses: similar to the all p-uses criterion, except that when a variable x has no global p-use, it reduces to the some c-uses criterion given below.
5. Some c-uses: For every variable x and node i such that x has a global definition in node i, pick complete paths that include def-clear paths from node i to some nodes j having a global c-use of x in node j.
6. All c-uses/Some p-uses: similar to the all c-uses criterion, except that when a variable x has no global c-use, it reduces to the some p-uses criterion given below.
7. Some p-uses: For every variable x and node i such that x has a global definition in node i, pick complete paths that include def-clear paths from node i to some edges (j,k) having a p-use of x on edge (j,k).
8. All uses: a combination of the all p-uses criterion and the all c-uses criterion (the sketch after this list shows how def-use pairs can be enumerated and checked against test paths).
9. All du-paths: For every variable x and node i such that x has a global definition in node i, pick complete paths that include all du-paths from node i
  • to all nodes j having a global c-use of x in j, and
  • to all edges (j,k) having a p-use of x on (j,k).
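To ground these criteria, here is a small sketch that enumerates the def-use pairs covered by the two test paths of the earlier example program. The def, c-use, and p-use annotations are taken from that reconstruction; everything else is a plain illustration of checking all-uses-style coverage, not a production tool.

# Which def-use pairs do the two test paths of the example program cover?
defs   = {1: {"x"}, 3: {"a"}, 6: {"x"}, 7: {"a"}}   # statements defining each variable
c_uses = {3: {"x"}, 6: {"x"}, 7: {"x"}, 8: {"a"}}   # computational uses
p_uses = {2: {"x"}, 4: {"x"}, 5: {"x"}}             # predicate uses

test_paths = [
    [1, 2, 3, 8],                        # x = 1
    [1, 2, 4, 5, 6, 5, 6, 5, 7, 8],      # x = -1
]

def covered_pairs(path):
    """Return (variable, def statement, use statement) triples reached via a def-clear sub-path."""
    pairs = set()
    for i, d in enumerate(path):
        for var in defs.get(d, set()):
            for node in path[i + 1:]:
                if var in c_uses.get(node, set()) or var in p_uses.get(node, set()):
                    pairs.add((var, d, node))
                if var in defs.get(node, set()):
                    break                # the definition is killed by a redefinition
    return pairs

covered = set().union(*(covered_pairs(p) for p in test_paths))
for var, d, u in sorted(covered):
    print(f"def of {var} at statement {d} reaches a use at statement {u}")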
Data Flow Testing Applications
Studies have found that the number of defects identified by achieving 90% “data coverage” is roughly twice the number detected with 90% branch coverage.
Data flow testing has been found effective even when it is not supported by automation.
It does require extra record keeping to track the status of variables, but tools make this tracking easy and reduce the testing effort considerably; data flow testing tools can also be integrated into compilers.
Conclusion
Data is a very important part of software engineering, and the testing performed on data and variables plays an important role. Data flow testing should therefore be carried out properly to ensure that your product works at its best.