Big Data, Infrastructure, and Performance
Organizations need to carefully study the effects of big data, advanced analytics, and artificial intelligence on infrastructure choices.
- By Brian J. Dooley
- March 13, 2018
As new data-intensive forms of processing such as big data analytics and AI continue to gain prominence, the effect on your infrastructure will grow as well. Rising data volumes and velocity strain the limits of current infrastructure -- from storage and data access to networking, integration, and security.
For organizations undertaking a modernization project, the need for infrastructure that can handle big data in real time is likely to become significant, particularly as they attempt to combine these advanced requirements with a move to the cloud. We discussed the implications and challenges with Jim D'Arezzo, CEO of Condusiv Technologies, and Mark Gaydos, chief marketing officer of Nlyte Software. Condusiv offers software-only storage performance solutions for virtual and physical server environments; Nlyte focuses on data center infrastructure management (DCIM).
Performance Optimization
"We're being swamped by a tsunami of data," says D'Arezzo. "Data center consolidation and updating is a challenge. There are issues such as data access speeds, compatibility with previous architectures, replication and backup, and cost of data storage. We run into cases where organizations do consolidation on a 'forklift' upgrade basis, simply dumping new storage and hardware into the system as a solution. Shortly thereafter, they often realize that performance has degraded. A bottleneck has been created that needs to be handled with optimization."
Condusiv is in the business of boosting Windows system performance, whether on a desktop or in a server-based environment. Recently, the company's focus has been on reducing I/O to improve overall performance; its approach cuts I/O across the compute, network, and storage layers using a software-only strategy. Condusiv guarantees a performance improvement of 50 percent or better.
"From our perspective, the first issue in storage architecture is the network," says D'Arezzo. "The network is critical, with generally 10 GB speeds and 40 GB plus available but expensive. One big wildcard is movement to the cloud and determining what's on premises, what is to be moved to outside cloud storage, and how that affects movement of data, analytics, and data security."
D'Arezzo notes that AI and machine learning provide unique challenges. "The thing about AI is that it requires massive amounts of data, compute power, and I/O," he says. "This is due to the amount of data necessary to have a proper AI solution. AI also needs speedy access to sufficient compute resources and has very intensive I/O."
Condusiv has two patented software engines, one improving reads from and one improving writes to data storage. Reads are served from the system's DRAM, which is 15 times faster than SSD. The more DRAM that can be put to good use, the less pressure is placed on the storage architecture and network -- the typical bottlenecks. By caching reads in DRAM, Condusiv significantly improves performance.
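As a rough illustration of the general principle -- serving repeated reads from fast DRAM rather than from storage -- the sketch below implements a simple least-recently-used block cache. It is a generic example of read caching, not a representation of Condusiv's patented engines; the backing file, block size, and cache capacity are arbitrary assumptions.

```python
# Minimal sketch, assuming a single backing file read in fixed-size blocks.
# Generic LRU read cache held in DRAM; not Condusiv's implementation.
from collections import OrderedDict

class DramReadCache:
    def __init__(self, backing_file, block_size=4096, max_blocks=1024):
        self.f = open(backing_file, "rb")
        self.block_size = block_size
        self.max_blocks = max_blocks
        self.cache = OrderedDict()  # block number -> bytes, kept in LRU order

    def read_block(self, block_no):
        if block_no in self.cache:
            # Cache hit: served from DRAM, no storage or network I/O.
            self.cache.move_to_end(block_no)
            return self.cache[block_no]
        # Cache miss: fall through to the (much slower) storage layer.
        self.f.seek(block_no * self.block_size)
        data = self.f.read(self.block_size)
        self.cache[block_no] = data
        if len(self.cache) > self.max_blocks:
            self.cache.popitem(last=False)  # evict the least recently used block
        return data
```

The design point is the same one D'Arezzo makes: every read answered from memory is one less I/O that has to cross the network and touch storage.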
Infrastructure Management
Mark Gaydos of Nlyte Software offers a broader perspective.
"Ultimately you need to aggregate data from various systems in varying formats and then normalize that data into information that is actionable," he says. "You then need to tie this information into where workloads are running so you have a physical-to-virtual-to-logical view of your data center. Only then can you begin to make informed decisions about consolidation."
Machine learning requires special measures to ensure optimal operation. "For an organization to benefit from machine learning in the data center, they need to be able to capture varying types of information from disparate systems and normalize that information first," Gaydos explains. "They need to be able to feed that data into some type of machine learning system, such as Watson, and be able to consume and decipher the information coming out to turn the analytics into real action."
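The shape of that loop -- normalized data in, model output mapped to action -- can be sketched in a few lines. Here a simple z-score check stands in for the machine learning system (the source names Watson only as an example); the threshold and the action text are assumptions made for illustration.

```python
# Minimal sketch of turning model output into action. A z-score anomaly check
# stands in for a real machine learning service; thresholds are assumptions.
import statistics

def flag_anomalies(values, threshold=3.0):
    mean = statistics.mean(values)
    spread = statistics.pstdev(values) or 1.0  # avoid division by zero
    return [i for i, v in enumerate(values) if abs(v - mean) / spread > threshold]

def to_actions(readings):
    values = [r["value"] for r in readings]
    for i in flag_anomalies(values):
        r = readings[i]
        yield ("Investigate %s: %s=%s at %s"
               % (r["asset"], r["metric"], r["value"], r["timestamp"]))
```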
Knowledge is power, and data center operators often lack tools, such as DCIM, that provide holistic insight into what they should upgrade and how. Without this information, it is difficult to justify -- or accomplish -- systematic modernization of computing infrastructure.
"Initiatives such as the Data Center Optimization Initiative (DCOI) in the U.S. federal government and Europe's General Data Protection Regulation (familiarly known as GDPR) are driving organizations to look at how they manage systems and data within their computing infrastructure," says Gaydos. "Many organizations are seeing this as an opportunity to deploy DCIM to modernize how they manage their infrastructure."
A Final Word
With the numerous considerations of storage and I/O, it is clear that organizations need to carefully study the effects of big data, advanced analytics, and artificial intelligence on infrastructure choices. Without sufficient infrastructure in place, the system will be unable to efficiently cope with the fire hose of data, and bottlenecks will emerge that could affect other systems. At the same time, data center upgrades must be cost-effective because changes can be expensive in implementation as well as in potential downstream effects. The data center is an integrated system that requires a holistic upgrade strategy.
About the Author
Brian J. Dooley is an author, analyst, and journalist with more than 30 years' experience in analyzing and writing about trends in IT. He has written six books, numerous user manuals, hundreds of reports, and more than 1,000 magazine features. You can contact the author at [email protected].