What is predictive analytics?

The science fiction thriller "Minority Report" depicts a method that the German police also want to use to hunt down criminals in the future: predictive policing. Using crime patterns from the past few years (location, time and method of the crime), software calculates, for example, the probability that a break-in will occur in a certain region. The police could then concentrate patrol cars in the areas classified as at risk.
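
To make the idea concrete, here is a minimal sketch of such a calculation. It is not how any real policing software works; it simply estimates, for each region, the share of past weeks in which at least one break-in was reported. The region names, the incident records and the four-week observation window are invented for illustration.

```python
from collections import Counter

# Hypothetical historical records: (region, week) pairs in which a break-in was reported.
incidents = [
    ("north", 1), ("north", 2), ("north", 2), ("north", 4),
    ("south", 3),
    ("east", 1), ("east", 4),
]
TOTAL_WEEKS = 4  # assumed length of the observation window


def weekly_breakin_probability(incidents, total_weeks):
    """Estimate, per region, the share of observed weeks with at least one break-in."""
    weeks_with_incident = Counter()
    # set() ensures that several break-ins in the same region and week count only once.
    for region, _week in set(incidents):
        weeks_with_incident[region] += 1
    return {region: n / total_weeks for region, n in weeks_with_incident.items()}


probabilities = weekly_breakin_probability(incidents, TOTAL_WEEKS)

# Regions ordered from highest to lowest estimated probability.
ranked = sorted(probabilities, key=probabilities.get, reverse=True)
print(ranked, probabilities)
```

A real system would take far more factors into account (time of day, season, method of the crime); the sketch only shows the basic step from historical patterns to a probability per region.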

Predictive policing is a form of predictive analytics, which uses data models to predict how a situation will or could develop in the future. Companies also want to be able to predict complex economic relationships in order to make better decisions and gain a competitive advantage.

But what exactly does predictive analytics mean? The term is often used in the context of business intelligence, business analytics and data mining. Related terms such as descriptive or prescriptive analytics add to the confusion.

Umbrella terms: Business Intelligence and Business Analytics

Predictive analytics is a subset of Business Intelligence (BI) and Business Analytics (BA). BI and BA are often used synonymously, although they differ in the questions they ask and in their methodology. In principle, business analytics represents a more advanced evolutionary stage of BI. However, business intelligence is often used as a generic term for all forms of data analysis in a company.

  1. The terms around big data
    Big data - what is it actually? Everyone talks about it, and everyone understands something different by it. This glossary explains the most important and most frequently used terms (some would say "buzzwords").

    compiled by Kriemhilde Klippstätter, freelance author and coach (SE) in Munich
  2. Ad targeting
    Trying to attract the potential customer's attention, mostly through "tailor-made" advertising.
  3. Algorithm
    A mathematical procedure implemented in software with which a data set is analyzed.
  4. Analytics
    With the help of software-based algorithms and statistical methods, data is interpreted. This requires an analytical platform that consists of software or software plus hardware and that provides the tools and computing power to be able to carry out various analytical queries. There are a number of different forms and uses, which are described in more detail in this glossary.
  5. Automatic Identification and Capture (AIDC)
    Any method of automatically identifying and collecting data about a given condition and then storing it in a computer system. One example is the information that a scanner reads from an RFID chip.
  6. Behavioral Analytics
    Behavioral analytics uses information about human behavior to understand intentions and predict future behavior.
  7. Business Intelligence (BI)
    The general term for identifying, sourcing and analyzing data.
  8. Call Detail Record (CDR) analysis
    A CDR contains data that telecommunications companies collect about mobile phone calls, such as the time and duration of each call.
  9. Cassandra
    An open source (Apache) distributed database management system for very large structured data sets (a “NoSQL” database system).
  10. Clickstream Analytics
    Describes the analysis of a user's web activities by evaluating their clicks on a website.
  11. Competitive monitoring
    Tables that automatically store the activities of the competition on the web.
  12. Complex Event Processing (CEP)
    A process in which all activities in an organization's systems are monitored and analyzed, so that the organization can react in real time when necessary.
  13. Data aggregation
    The gathering of data from different sources for the preparation of a report or for analysis.
  14. Data analytics
    A piece of software that is used to pull information from a data set. The result can be a report, a status or an action that is started automatically.
  15. Data Architecture and Design
    Describes how company data is structured. This usually takes place in three steps: conceptual mapping of the business units, logical mapping of the relationships within each business unit, and physical construction of a system that supports the activities.
  16. Data exhaust
    The data that a person generates "on the fly" during their Internet activity.
  17. Data virtualization
    The process of abstracting different data sources through a single layer of access to the data.
  18. Distributed Object
    A software object that resides on one computer but can be used from another computer over the network.
  19. De-identification
    The removal of all data that associates a person with specific information.
  20. Distributed processing
    The execution of a process across different networked computers.
  21. Drill
    Apache Drill is an open source SQL query engine for Hadoop and NoSQL data management systems.
  22. Hadoop
    A free framework from the Apache Software Foundation, written in Java, for scalable, distributed software running in a cluster. It is based on Google's well-known MapReduce algorithm and on ideas from the Google File System.
  23. HANA
    SAP's software and hardware platform with in-memory computing for real-time analysis and large transaction volumes.
  24. In-database analytics
    In-Database Analytics describes the integration of the analysis methods into the database. The advantage is that the data does not have to be moved for the evaluation.
  25. In-memory database
    Any database system that uses main memory for data storage.
  26. In-Memory Data Grid (IMDG)
    The distributed data storage in the main memory of many servers for fast access and better scalability.
  27. Machine-generated data
    All data that is automatically generated by a computing process, an application or a non-human source.
  28. Map / reduce
    A method in which a large problem is broken down into smaller ones that are distributed for processing ("map") to different computers in a network or cluster, or to a grid of computers at different locations. The partial results are then collected and combined into a single (reduced) result. Google has protected its process under the trademark "MapReduce".
  29. Mashup
    Different data sets are combined within an application in such a way that the result is improved.
  30. NoSQL
    Databases that are not structured relationally and that can handle large volumes of data. They do not need a fixed table schema and they scale horizontally. Apache Cassandra, for example, is a NoSQL database.
  31. Operational Data Store (ODS)
    It collects data from different sources so that further operations can be carried out before the data is exported to a data warehouse.
  32. Pattern recognition
    The classification of automatically recognized patterns.
  33. Predictive Analytics
    This form of analytics uses statistical functions in one or more data sets to predict trends or future events.
  34. Recommendation engine
    An algorithm analyzes customers' orders on a website and immediately selects and offers suitable additional products.
  35. Risk Analysis
    The application of statistical methods to one or more data sets in order to be able to estimate the risk of a project, an action or a decision.
  36. Sentiment Analysis
    Posts that people write in social networks about a product or company are statistically evaluated.
  37. Variable pricing
    The purchase price of a product follows supply and demand. This requires real-time monitoring of consumption and inventory.
  38. Parallel data analysis
    An analytical problem is broken down into subtasks and the algorithms are applied to each problem component simultaneously and in parallel.
  39. Query Analysis
    In this process, a search query is optimized in order to get the best possible result.
  40. Reference data
    Data that describe a physically or virtually existing object and its properties.
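
The map/reduce entry in the glossary above can be illustrated with a small word count, the customary introductory example for this method. The sketch below runs on a single machine and only imitates the three steps (map, group, reduce); it is not Hadoop code, and the function names and the sample text chunks are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group all values that belong to the same key (word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values of each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


# Hypothetical input, already split into chunks. In a cluster, each chunk
# would be mapped on a different machine; here the chunks are processed in turn.
chunks = ["big data big", "data analytics", "big analytics"]

mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle(mapped))
print(word_counts)
```

Because each chunk is mapped independently of the others, the map step is the part that can be distributed to many computers; only the grouping step needs to bring the partial results together.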

Business Intelligence (BI) enables companies to answer questions about their current economic situation by systematically collecting, evaluating and presenting company data. Key figures and evaluations at the end of the month or quarter, combined with target/actual comparisons, support management in making better operational or strategic decisions.
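
A target/actual comparison of this kind comes down to simple arithmetic. The following sketch shows one possible way to compute it; the key figures and their values are invented, and the function name `target_actual_report` is a hypothetical helper rather than part of any BI product.

```python
def target_actual_report(targets, actuals):
    """Compare actual key figures with their targets.

    Returns, per key figure, the absolute deviation and the deviation
    as a percentage of the target (rounded to one decimal place).
    """
    report = {}
    for figure, target in targets.items():
        actual = actuals[figure]
        deviation = actual - target
        report[figure] = {
            "target": target,
            "actual": actual,
            "deviation": deviation,
            "deviation_pct": round(100 * deviation / target, 1),
        }
    return report


# Hypothetical monthly key figures.
targets = {"revenue": 200_000, "new_customers": 50}
actuals = {"revenue": 190_000, "new_customers": 56}

report = target_actual_report(targets, actuals)
for figure, row in report.items():
    print(figure, row)
```

A negative deviation means the target was missed, a positive one that it was exceeded. This kind of report describes what has already happened; it makes no statement about the future.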