Apache Mahout

What is it?

Apache Mahout is a machine learning library built on Apache Hadoop. It is based on the MapReduce programming model but is not limited to it: the core libraries are also optimized for non-distributed algorithms.

The library draws, for the most part, on the Apache Hadoop system, which makes it possible to operate on large data sets. When data is already stored in HDFS, Mahout is a natural choice for data science work. It can be used to automatically find patterns in those data sets and to extract business value from them. According to its authors, the aim of the project is to provide an easy path from big data to big information.

Mahout supports four main use cases, but it is not limited to them:

Recommendation

finding content suited to individual users

Clustering

for instance, grouping similar documents by subject or attributes

Classification

the system learns from existing documents with assigned categories and can then assign appropriate categories to new documents

Frequent itemset mining

makes it possible to find elements which usually appear together. For example, in an online shop the algorithm searches shopping carts for items which are often bought together during one session and suggests them to the client.
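The shopping cart example can be illustrated with a minimal sketch in plain Python. It is not Mahout's API or its algorithm: it simply counts how often each pair of items occurs in the same session and keeps the pairs that reach a chosen threshold. The function name `frequent_pairs`, the `min_support` parameter, and the sample sessions are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(sessions, min_support):
    """Return item pairs that occur together in at least min_support sessions."""
    counts = Counter()
    for items in sessions:
        # Deduplicate and sort so each pair is counted once per session
        # and always appears in the same order.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical shopping sessions, invented for this example.
sessions = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "coffee"],
    ["bread", "butter", "coffee"],
]

pairs = frequent_pairs(sessions, min_support=2)
print(pairs)  # {('bread', 'butter'): 3}
```

Counting every pair in memory like this only works for small data sets; a distributed library is needed precisely because the number of candidate item combinations grows quickly with the size of the catalogue and the number of sessions.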

Mahout can be used alongside other machine learning tools (e.g. Apache Spark), and it has connectors that make it easy to exchange data between systems.

A clear advantage is that Mahout is distributed under the open source Apache License 2.0, so it can be used in commercial products.

Our experience

BlueSoft successfully uses Apache Mahout for clients in industries such as finance, telecommunications and life sciences, and our expertise allows us to make full use of its capabilities.

Our company has extensive experience in business analysis, which helps our clients identify the problems that can best be addressed with machine learning algorithms. A team of experienced programmers then implements the solutions while keeping costs in check.

BlueSoft has successfully implemented many projects in this area. We will happily present our portfolio in person and answer further questions about the technology itself and the benefits its implementation can bring.

See other technologies that we use in this area

Machine Learning