Developing a comprehensive search engine for an industrial machinery company
This article describes how Evrone helped a startup develop a search engine and website for selling industrial machinery and equipment.
In 2012, Dan Pinto and Dmitriy Rokhfeld created an aggregator that automatically collected information about used industrial machinery and heavy equipment that was for sale.
The idea of creating the service came to Dan when he received a work assignment to search the Internet for a printing machine. The task turned out to be more difficult than expected. You can't buy such a machine on eBay, and comparing price lists on a thousand different sites takes a lot of time. So, Dan developed a technology for automatically collecting information from websites and combining the data in one place. Later, he contacted his childhood friend, Dmitriy, and they launched a startup to build an eCommerce website with data parsing that would streamline the process of finding and organizing information about available equipment and machinery.
Creating a single catalog would eliminate the need to comb through thousands of offers on different websites in search of the right option. So, Dan and Dmitriy developed one, using tools to extract data from websites and applications to parse the information. Machinio is an eCommerce search engine that automatically collects and organizes data from different listings into one place, where the seller and the buyer can discuss the details of the transaction.
The first version of the service was written by Dan himself. Over time, as the Machinio team continued to develop the used machinery market, they added more sales managers and won more and more regular contracts with equipment sellers. Eventually, it became clear that it would be impossible for one person to run the company and manage the project code at the same time. Machinio needed more developers, but at that point, they did not want to lose momentum by taking the time to hire and onboard in-house developers. They decided that their best option would be to bring in a consulting company that could immediately provide highly experienced developers, allowing them to keep moving forward.
So, Machinio turned to Evrone to strengthen their team and help them with the development of a product search platform for selling used equipment. Evrone was responsible for the backend, the core of the startup. Our task was to bolster Machinio's original in-house development processes with our expertise.
The process of collecting and organizing listings from thousands of websites
Machinio uses web scraping tools to automatically collect information from sellers' websites, and crawlers to sort pages and save the necessary data to the database. The crawlers use Ferrum, an open-source, high-level API for controlling Chrome from Ruby that was created with the support of Evrone. Because it drives a real browser, it can collect information for the aggregator from websites that render their content with JavaScript frameworks such as React or Vue.js.
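As a rough illustration of the step that follows rendering, the sketch below extracts listing fields from HTML that a headless browser has already produced. It is not Machinio's crawler: it uses Python's standard `html.parser` rather than Ferrum, and the CSS class names and field names are assumptions made up for the example.

```python
from html.parser import HTMLParser

# Hypothetical markup: these class names are assumptions for illustration,
# not the structure of any real seller's site.
FIELD_CLASSES = {
    "listing-title": "title",
    "listing-price": "price",
    "listing-year": "year",
}


class ListingParser(HTMLParser):
    """Collects the text of elements whose class maps to a listing field."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class", "")
        self._current = FIELD_CLASSES.get(css_class)

    def handle_data(self, data):
        if self._current and data.strip():
            self.fields[self._current] = data.strip()

    def handle_endtag(self, tag):
        self._current = None


def parse_listing(rendered_html):
    """Return a dict of the known fields found in already-rendered HTML."""
    parser = ListingParser()
    parser.feed(rendered_html)
    return parser.fields
```

In practice each seller's site would need its own mapping from markup to fields, which is one reason the text extracted this way still has to be verified afterwards.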
A bigger problem is that product information may be incomplete. For example, certain specifications may be missing, or the photo might not match the model name. The crawler simply parses catalogs with descriptions of goods as they are presented on the original website, so the second step is the automatic verification and validation of the data received.
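A minimal sketch of what such a validation pass might look like is shown below. The required fields and the rule itself are assumptions for illustration. Matching a photo against a model name is beyond a short example, so the sketch checks the model name against the title instead.

```python
# Assumed minimal schema; the real set of required fields is not given in the article.
REQUIRED_FIELDS = ("title", "price", "category")


def validate_listing(listing):
    """Return a list of problems found; an empty list means the listing passed."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not str(listing.get(field) or "").strip():
            problems.append(f"missing {field}")

    # Stand-in consistency check: the model name should appear in the title.
    model = str(listing.get("model") or "").strip().lower()
    title = str(listing.get("title") or "").lower()
    if model and model not in title:
        problems.append("model name not found in title")
    return problems
```

Returning a list of problems rather than a single pass/fail flag makes it possible to report exactly what is wrong with a listing.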
The project uses machine learning to categorize incoming listings. The model is trained on a set of listings that have already been categorized, and when it is given a new listing, it analyzes the text and assigns a category.
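The idea of training on labeled listings and then classifying new text can be sketched with a tiny naive Bayes classifier. This is a deliberately simple stand-in, not the model used in the project, and the category names and training phrases in the usage example are invented.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


def train(examples):
    """examples: iterable of (text, category) pairs.

    Returns per-category word counts and per-category document counts.
    """
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, category in examples:
        word_counts[category].update(tokenize(text))
        doc_counts[category] += 1
    return word_counts, doc_counts


def classify(text, word_counts, doc_counts):
    """Return the category with the highest naive Bayes log-probability."""
    vocabulary = {word for counts in word_counts.values() for word in counts}
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(doc_counts[category] / total_docs)
        for word in tokenize(text):
            # Laplace smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category
```

A short usage example with made-up training data:

```python
model = train([
    ("offset printing press four color", "printing"),
    ("digital printing machine", "printing"),
    ("hydraulic excavator with bucket", "construction"),
    ("crawler excavator tracked", "construction"),
])
category = classify("used printing press", *model)
```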
Automatic categorization does not always work, because some lots are so unusual that they do not fit any existing category. Once, a whole factory with equipment worth several million dollars was put up for sale. A listing like this needs its own set of characteristics, and machine learning alone is not enough.
Tracking the data changes
Sometimes, data collection from a client's website stops working because the markup or technical data has changed. A data verification system was developed for these cases. If something on the seller's side has stopped working correctly, the seller is notified so they can fix what is broken. The information is uploaded every day, but not in real time. Selling and buying heavy equipment is quite a slow process, and there are not many large companies on the market with a frequently updated catalog, so real-time updates are not critical.
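One way such a daily check could work is to compare each crawl against the previous one and flag suspicious changes. The sketch below is an assumption about the general approach, not a description of Machinio's verification system. The thresholds, field names, and function name are all hypothetical.

```python
def detect_crawl_problems(previous_count, listings, required=("title", "price"), max_drop=0.5):
    """Flag a daily crawl whose results suggest the seller's markup has changed.

    previous_count: number of listings collected in the previous crawl.
    listings: list of dicts collected in the current crawl.
    max_drop: fraction of listings that may disappear before an alert is raised.
    """
    alerts = []
    if previous_count > 0 and len(listings) < previous_count * (1 - max_drop):
        alerts.append(f"listing count dropped from {previous_count} to {len(listings)}")
    for field in required:
        missing = sum(1 for item in listings if not item.get(field))
        if listings and missing == len(listings):
            alerts.append(f"field '{field}' is empty in every listing")
    return alerts
```

A non-empty result would be the trigger for notifying the seller. Because updates run once a day, a simple day-over-day comparison like this is enough to catch breakage before the next upload.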
Machinio was able to create the most comprehensive database of used equipment, which requires substantial capacity. They now run around 10-12 servers, and when that capacity is no longer enough, backup servers are connected automatically.
Our part was to develop the whole system in Ruby on Rails. Later, the Machinio team decided to use TensorFlow for machine learning, headless Chrome for crawlers, and Apache Solr for search. Everything in the system is set up so that the user can search both by the name of a specific model and by free-form queries and characteristics. The product has a microservice architecture, and there is a separate team working on crawlers.
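To illustrate how one search box can serve both model names and free-form queries, the sketch below builds the query string for a Solr `/select` request using the eDisMax query parser. The parameters `defType`, `q`, `qf`, `fq`, and `rows` are standard Solr request parameters. The field names and boosts are hypothetical and do not reflect Machinio's actual schema.

```python
from urllib.parse import urlencode


def build_solr_query(text, filters=None, rows=20):
    """Build the query string for a Solr /select request.

    The fields model, title, and description are hypothetical; filter values
    are assumed not to contain double quotes.
    """
    params = [
        ("defType", "edismax"),                 # parser suited to free-form user input
        ("q", text),
        ("qf", "model^3 title^2 description"),  # search several fields, boosting the model name
        ("rows", str(rows)),
    ]
    for field, value in sorted((filters or {}).items()):
        # Each characteristic becomes a separate filter query.
        params.append(("fq", f'{field}:"{value}"'))
    return urlencode(params)
```

Boosting the model field means an exact model name ranks highest, while the same request still matches looser descriptive queries through the title and description fields.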
According to the SimilarWeb service, in February 2020, the average monthly traffic of the Machinio website was 670,000 users, 71.79% of whom came from search engines.
There are similar services on the market, but Machinio has several features that help it attract a larger audience:
- the service is optimized for search engines, so it is easy to find
- the catalog contains offers from 190+ countries, which helps users find the most convenient offers
- buyers are vetted before their leads are processed, and fake requests are screened out
The Machinio startup used aggregation to change the fundamentals of the equipment sales business and raised over $4.04 million in funding. In 2015, Dan Pinto and Dmitriy Rokhfeld were named to the Forbes 30 Under 30 list. In 2017, the company opened its second office in Berlin, and in July 2018, the founders sold Machinio to Liquidity Services, which currently manages its assets.
Machinio is widely known around the world. It has been featured in Forbes, Inc.com, TechCrunch, and many other well-known global media outlets.
At Evrone, we are proud to have helped such an unusual startup to skyrocket. Our expertise helped Machinio develop to the level of functionality that users needed, achieving excellent metrics and meeting their financial goals.