In the wake of the mass digitization of data, AceInfo's Center of Excellence (CoE) understands the complexity of mining data to obtain information for efficient decision making and effective dissemination of that information. Since its inception, the CoE has developed expertise in Big Data computing by conducting in-house clustered pilot implementations using Apache technologies such as Hadoop, HBase, and Hive. The CoE has also worked with several frameworks and tools for handling the large volumes and wide variety of data generated at high velocity, so that it can be processed rapidly and easily using multiple methodologies.
The Hadoop Distributed File System (HDFS), with its master/slave configuration, provides high-throughput access to application data when combined with Hadoop's YARN-based MapReduce framework. Large volumes and a wide variety of data generated at high velocity can be processed rapidly and easily using multiple MapReduce jobs. These jobs are targeted and pushed to multiple source systems and operational data stores, which can be spread across geographic locations or reside within the organization. The jobs perform local computations on a daily or periodic basis to identify, cleanse, and transform the application's structured or unstructured data into candidate data for extraction and transfer into Enterprise Data Warehouses (EDWs). Such intelligent distributed data computing does not depend on hardware for resilience; instead, the software library itself is designed to detect and handle failures at the application layer, delivering highly available Business Intelligence and on-demand predictive and actionable analytics on top of clusters of commodity computers.
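The identify/cleanse/transform pattern described above can be illustrated with a minimal conceptual sketch in plain Python. This is not a Hadoop job; a production implementation would use Hadoop's Java MapReduce API (or Hive), and the record format and field names here are illustrative assumptions only.

```python
from collections import defaultdict

# Conceptual sketch of the map/reduce pattern described above.
# Assumed record format (illustrative): "region, metric, value"

def map_phase(raw_lines):
    """Map step: cleanse malformed records and transform the rest
    into (key, value) pairs, as each mapper does on its local data."""
    for line in raw_lines:
        parts = line.strip().split(",")
        if len(parts) != 3:          # drop malformed records (cleansing)
            continue
        region, metric, value = parts
        try:
            yield region.strip().lower(), float(value)
        except ValueError:
            continue                  # skip non-numeric values

def reduce_phase(pairs):
    """Reduce step: aggregate values per key, as a reducer would
    do for its assigned partition of the key space."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

raw = [
    "East, sales, 120.5",
    "West, sales, 300.0",
    "bad record",                 # filtered out by the map phase
    "East, sales, 79.5",
]
result = reduce_phase(map_phase(raw))
print(result)  # {'east': 200.0, 'west': 300.0}
```

In a real cluster, the map phase runs in parallel next to the data on each HDFS node, and only the much smaller aggregated output is shuffled to reducers and ultimately loaded into the EDW.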