What is Elasticsearch: Intro to Search in Data and Elasticsearch
May 03, 2022
What is Search in data?
How long would it take you to identify all the products that contain the letter 'b' in this list of 7 products: "Bottle", "Tablet", "Dresser", "Hairbrush", "Toy truck", "Swing set", "Apple Cider"? Probably a second. Now imagine a big e-commerce store like Amazon that sells millions of products. How long would it take you to filter through those products? Chances are you wouldn't even try!
While computers can search far faster than our feeble human brains can, they still struggle to filter through the trillions of bytes of data that companies receive. For companies with big data that rely on search for their business needs, this is a problem their development teams need to solve.
So how do they do it?
Search and Analytics Engines: Introducing Elasticsearch
Development teams typically integrate with third-party search and data analytics platforms for search capabilities. A popular solution used by many engineering teams is Elasticsearch, a widely adopted search and analytics engine since its launch in 2010.
Elasticsearch allows us to store, search, and analyze huge volumes of data quickly and in near real time. It does this by indexing* data, which makes searching much faster. Elasticsearch is also known to be reliable and fault tolerant because it automatically replicates* data across multiple nodes*. It also has REST APIs, and a UI (typically provided by its companion tool, Kibana), to perform CRUD* and search operations in the cluster*, which makes it easy to use.
Want to deep dive into Elasticsearch with a demo? Watch this video from the official Elastic community.
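To make the REST API point concrete, here is a minimal sketch of what a "create" and a "search" request might look like in Python. It only builds the HTTP requests and does not send them, so it runs without a cluster. The address http://localhost:9200, the index name "products", and the helper function names are assumptions for illustration, not something the article specifies.

```python
import json
from urllib.request import Request

# Assumptions for illustration: a local cluster on the default port,
# and a hypothetical index called "products".
BASE_URL = "http://localhost:9200"
INDEX = "products"


def build_index_request(doc_id, document):
    """Create (or overwrite) one document: PUT /<index>/_doc/<id>."""
    return Request(
        url=f"{BASE_URL}/{INDEX}/_doc/{doc_id}",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def build_search_request(field, text):
    """Full-text search with a match query: POST /<index>/_search."""
    body = {"query": {"match": {field: text}}}
    return Request(
        url=f"{BASE_URL}/{INDEX}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    create = build_index_request(1, {"name": "Bottle", "category": "kitchen"})
    search = build_search_request("name", "bottle")
    for req in (create, search):
        print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Against a running cluster, each request could be sent with urllib.request.urlopen(req). Depending on the Elasticsearch version and how it is configured, HTTPS and credentials may also be required.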
Definitions of tech terms*:
CRUD: Stands for Create, Read, Update, Delete. Common tech terminology for working with data and API operations.
Cluster: Group of two or more servers (physical or virtual) that act like a single system. Ex: 3 identical servers hosting the same application for horizontal scaling.
Node: A specific server in a cluster (Could be physical or a VM).
Replication: Process of making multiple copies of data and storing them in different locations for backup purposes. Part of a fault-tolerant design.
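Data Indexing: Organizing data into logical groupings (e.g. toys, furniture, gifts) so it can be retrieved quickly instead of scanning the entire data set on every query. In Elasticsearch, this takes the form of an inverted index, which maps each term to the documents that contain it.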
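The idea behind indexing can be sketched in a few lines of Python using the 7 products from earlier. This is a toy illustration, not how Elasticsearch is implemented internally: it compares checking every product on every query with building a word-to-products lookup once and reusing it. The function names are made up for this example.

```python
from collections import defaultdict

PRODUCTS = ["Bottle", "Tablet", "Dresser", "Hairbrush",
            "Toy truck", "Swing set", "Apple Cider"]


def linear_search(products, word):
    """No index: look at every product on every query."""
    return [p for p in products if word.lower() in p.lower().split()]


def build_index(products):
    """Build the lookup once: each lowercase word maps to the products containing it."""
    index = defaultdict(set)
    for product in products:
        for word in product.lower().split():
            index[word].add(product)
    return index


def indexed_search(index, word):
    """With an index: a single dictionary lookup, however many products exist."""
    return sorted(index.get(word.lower(), set()))


if __name__ == "__main__":
    index = build_index(PRODUCTS)
    print(linear_search(PRODUCTS, "truck"))
    print(indexed_search(index, "truck"))
```

Both searches return the same products. The difference is cost: the linear version grows slower as the catalog grows, while the indexed version pays the cost up front when the index is built.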