Analytics and predictive modeling are traditionally done on a server or an analyst's workstation after importing all of the data for analysis to that host. The need for in-database analytics has emerged as the computational resources of database platforms have increased, providing more capacity for data processing inside the database itself. At the same time, the speed of the connections (network switches) between storage devices and computational servers keeps increasing, which opens up various options for how and where to perform analyses of big data.
This paper discusses Statistica's native, database-agnostic approach to in-database analytic processing. Statistica has traditionally supported external analytics libraries, analytic and algorithm marketplaces, and platforms (through the Statistica Native Distributed Analytics Architecture, NDAA). The platform provides a wide range of options for moving analytic algorithms to the data or to the edge, and it can leverage the leading open-source libraries for in-memory analytics via Spark. This white paper focuses specifically on the trade-offs and options for query-based in-database analytics using Statistica, and on the key factors to consider before choosing a specific approach for big data analytics.
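To make the idea of query-based in-database analytics concrete, the following minimal sketch contrasts the two general approaches: importing every row to the analysis host and aggregating there, versus sending a query so that the database computes the aggregates and returns only the results. It is an illustration under stated assumptions, not Statistica's implementation or API: an in-memory SQLite database stands in for the database platform, and the table `measurements`, its columns `grp` and `value`, and the function names are all hypothetical.

```python
import sqlite3


def make_demo_db():
    """Create a small in-memory database (a stand-in for a real platform)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and column names, used only for illustration.
    conn.execute("CREATE TABLE measurements (grp TEXT, value REAL)")
    rows = [("a", 1.0), ("a", 2.0), ("a", 3.0), ("b", 10.0), ("b", 14.0)]
    conn.executemany("INSERT INTO measurements VALUES (?, ?)", rows)
    conn.commit()
    return conn


def stats_client_side(conn):
    """Traditional approach: pull every row to the host, then aggregate."""
    groups = {}
    for grp, value in conn.execute("SELECT grp, value FROM measurements"):
        groups.setdefault(grp, []).append(value)
    return {g: (len(v), sum(v) / len(v)) for g, v in groups.items()}


def stats_in_database(conn):
    """Query-based approach: the database aggregates; one row per group returns."""
    query = "SELECT grp, COUNT(*), AVG(value) FROM measurements GROUP BY grp"
    return {g: (n, mean) for g, n, mean in conn.execute(query)}


if __name__ == "__main__":
    conn = make_demo_db()
    print(stats_client_side(conn))
    print(stats_in_database(conn))
```

Both functions return the same per-group counts and means. The difference is in what crosses the connection: the first transfers every row, while the second transfers one row per group, which is the kind of trade-off between data movement and database-side computation that this paper examines.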