Talk by Dr. Gautam Shroff, VP and Chief Scientist of Tata Consultancy Services
We define, describe and motivate an emerging business intelligence need that we call Enterprise Information Fusion, which involves exploiting multi-structured external and internal data sources. As a consequence of the growth and popularity of social media such as Twitter, news events of even minor or highly local import are often reported there, by reporters as well as by the general public. For example, a fire in a supplier’s factory halfway around the world may be of interest even from afar. Similarly, conversations in specialized blogs and discussion forums often mention specific faults or difficulties being faced by consumers of products or services. In each case, many such events form a ‘long tail’, i.e., they may never gather significant support; yet being able to respond early to such events can be of significant business value across diverse industries, such as manufacturing, retail or insurance. We first describe how this long tail of events can be detected using a combination of locality-sensitive hashing, information extraction and machine learning. Next, we argue that to extract longer-term insights from such information, so as to be able to predict whether a new event is likely to have an impact, such as a dip in sales in some slice of the market, external events need to be continuously correlated with multiple, often unrelated sources. To do so, however, such disparate data sources first need to be harmonized to a common level of granularity. We describe how to harmonize data sources using map-reduce, but in an approximate fashion via machine learning, as well as incrementally, as is often required in practice. Finally, we conclude with some more applications of Big Data analytics in large enterprises, why the new technology matters, and an outline of other related projects at TCS Research.
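As a rough illustration of the locality-sensitive hashing step mentioned in the abstract, the sketch below groups near-duplicate short posts with MinHash signatures and banding. It is a generic textbook-style sketch, not the method presented in the talk; the function names (`shingles`, `minhash_signature`, `candidate_pairs`), the parameter values, and the sample posts are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (lowercased)."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}


def minhash_signature(items, num_hashes=32):
    """MinHash signature: for each seeded hash, keep the minimum over all items."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(hashlib.md5(f"{seed}:{item}".encode()).digest()[:8], "big")
            for item in items
        ))
    return signature


def candidate_pairs(docs, num_hashes=32, bands=16):
    """Return pairs of document ids whose signatures agree on at least one band."""
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for doc_id, text in docs.items():
        sig = minhash_signature(shingles(text), num_hashes)
        for b in range(bands):
            # Documents sharing a whole band land in the same bucket.
            buckets[(b, tuple(sig[b * rows:(b + 1) * rows]))].add(doc_id)
    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


if __name__ == "__main__":
    # Hypothetical posts, invented for this example.
    posts = {
        "p1": "Fire reported at supplier factory near the harbour tonight",
        "p2": "fire reported at supplier factory near the harbour tonight",
        "p3": "Customers complain about battery drain on the new phone",
    }
    print(candidate_pairs(posts))
```

More bands with fewer rows each make the scheme more tolerant of differences between posts, at the cost of more false candidates; candidates would normally be verified with an exact similarity check before being treated as the same event.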
Dr. Gautam Shroff is Vice President & Chief Scientist, Tata Consultancy Services, and heads TCS’ Innovation Lab in Delhi, India. He is also an adjunct faculty member at IIT and IIIT Delhi.
Prior to joining TCS in 1998, Dr. Shroff had been on the faculty of the California Institute of Technology, Pasadena, USA, and thereafter of the Department of Computer Science and Engineering at the Indian Institute of Technology, Delhi, India. He has also held visiting positions at NASA Ames Research Center in Mountain View, CA, and at Argonne National Laboratory near Chicago. In 1994 he received the "Young Scientist Award" from the Department of Atomic Energy (Govt. of India).
Dr. Shroff has published over 30 research papers in the areas of computational mathematics, parallel computing, distributed systems, software architecture, software engineering, and information fusion. He has also written a book, “Enterprise Cloud Computing”, published by Cambridge University Press in 2010. Dr. Shroff graduated from the Indian Institute of Technology, Kanpur, India, in 1985 and received his Ph.D. in Computer Science from Rensselaer Polytechnic Institute, NY, USA, in 1990.