Querify Question Shop: Explore Expert Solutions and Unique Q&A Merchandise

Asked: November 28, 2024

Implementation of Elastic Search in Django


Introduction

In the first article, we delved into how Elasticsearch works under the hood.

In this article, we will implement Elasticsearch in a Django application.

This article is intended for readers familiar with Django; we will not explain project setup in depth or cover basics such as models and views.

Setup

Clone this repository into a folder of your choosing.

  git clone git@github.com:robinmuhia/elasticSearchPOC.git .   

or get the repo from GitHub: https://github.com/robinmuhia/elasticSearchPOC

The project relies on three libraries that abstract most of what we need to work with Elasticsearch from Django.

  django-elasticsearch-dsl==8.0
  elasticsearch==8.0.0
  elasticsearch-dsl==8.12.0

Create a virtual environment, activate it and install the dependencies in the requirements.txt file.

  python3 -m venv venv
  source venv/bin/activate
  pip install -r requirements.txt

Your project structure should look like the image below:

[Image: project structure]

Now we're ready to go.

Understanding the project

Settings file

The project is a simple Django application. It has your usual setup structure.

In the config folder, we have our settings.py file.
For this project, the Elasticsearch settings are simple, as shown below:

  INSTALLED_APPS = [
      "django.contrib.admin",
      "django.contrib.auth",
      "django.contrib.contenttypes",
      "django.contrib.sessions",
      "django.contrib.messages",
      "whitenoise.runserver_nostatic",
      "django.contrib.staticfiles",
      "django_extensions",
      "django_elasticsearch_dsl",
      "rest_framework",
      "elastic_search.books",
  ]

  ELASTICSEARCH_DSL = {
      "default": {
          "hosts": [os.getenv("ELASTICSEARCH_URL", "http://localhost:9200")],
      },
  }
  ELASTICSEARCH_DSL_SIGNAL_PROCESSOR = "django_elasticsearch_dsl.signals.RealTimeSignalProcessor"
  ELASTICSEARCH_DSL_INDEX_SETTINGS = {}
  ELASTICSEARCH_DSL_AUTOSYNC = True
  ELASTICSEARCH_DSL_AUTO_REFRESH = True
  ELASTICSEARCH_DSL_PARALLEL = False

In a production-ready application, I would recommend using the CelerySignalProcessor. The RealTimeSignalProcessor re-indexes documents as soon as a model changes, as part of the same request that made the change. The CelerySignalProcessor handles the re-indexing asynchronously, so our users do not experience added latency when they modify any of our models. You would have to set up Celery, though.
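As a rough sketch of that switch, the settings fragment below picks the signal processor from an environment variable. USE_CELERY_INDEXING is a hypothetical variable name that the project does not define, and the Celery processor only makes sense once Celery itself is configured.

```python
import os

# Assumption: USE_CELERY_INDEXING is a made-up environment variable used only
# for this illustration; the project itself does not read it.
use_celery = os.getenv("USE_CELERY_INDEXING", "false").lower() == "true"

# Both dotted paths point at signal processors shipped with
# django-elasticsearch-dsl; the Celery one needs a working Celery setup.
ELASTICSEARCH_DSL_SIGNAL_PROCESSOR = (
    "django_elasticsearch_dsl.signals.CelerySignalProcessor"
    if use_celery
    else "django_elasticsearch_dsl.signals.RealTimeSignalProcessor"
)

print(ELASTICSEARCH_DSL_SIGNAL_PROCESSOR)
```

Keeping the real-time processor as the default means local development works without a broker.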

Read more about the nuances of these settings in the django-elasticsearch-dsl documentation.

Models

  from django.db import models


  class GenericMixin(models.Model):
      """Generic mixin to be inherited by all models."""

      id = models.AutoField(primary_key=True, editable=False, unique=True)
      created_at = models.DateTimeField(auto_now_add=True)
      updated_at = models.DateTimeField(auto_now=True)

      class Meta:
          abstract = True
          ordering = ("-updated_at", "-created_at")


  class Country(GenericMixin):
      name = models.CharField(max_length=200)

      def __str__(self):
          return self.name


  class Genre(GenericMixin):
      name = models.CharField(max_length=100)

      def __str__(self):
          return self.name


  class Author(GenericMixin):
      name = models.CharField(max_length=200)

      def __str__(self):
          return self.name


  class Book(GenericMixin):
      title = models.CharField(max_length=100)
      description = models.TextField()
      genre = models.ForeignKey(Genre, on_delete=models.CASCADE, related_name="genres")
      country = models.ForeignKey(Country, on_delete=models.CASCADE, related_name="countries")
      author = models.ForeignKey(Author, on_delete=models.CASCADE, related_name="authors")
      year = models.IntegerField()
      rating = models.FloatField()

      def __str__(self):
          return self.title

The GenericMixin has fields that should be inherited by all models. For a production application, I would recommend using a UUID as the primary key, but we will use a normal auto-incrementing integer field as it is easier for this project.

The models are fairly self-explanatory. We will be indexing and querying the Book model. Our goal is to search for a book by its title and description, while also being able to filter by year and rating.

Documents file

We have a documents.py file in the books folder.
The file name matters: django-elasticsearch-dsl discovers documents by importing the documents module of each installed app. Our documents will be written here. For our Book model, the code is shown below:

  from django_elasticsearch_dsl import Document, fields
  from django_elasticsearch_dsl.registries import registry

  from elastic_search.books.models import Author, Book, Country, Genre


  @registry.register_document
  class BookDocument(Document):
      genre = fields.ObjectField(
          properties={
              "name": fields.TextField(),
          }
      )
      country = fields.NestedField(
          properties={
              "name": fields.TextField(),
          }
      )
      author = fields.NestedField(
          properties={
              "name": fields.TextField(),
          }
      )

      class Index:
          name = "books"

      class Django:
          model = Book
          fields = [
              "title",
              "description",
              "year",
              "rating",
          ]

          related_models = [Genre, Country, Author]

      def get_queryset(self):
          return super().get_queryset().select_related("genre", "author", "country")

      def get_instances_from_related(self, related_instance):
          if isinstance(related_instance, Genre):
              return related_instance.genres.all()
          elif isinstance(related_instance, Country):
              return related_instance.countries.all()
          elif isinstance(related_instance, Author):
              return related_instance.authors.all()
          else:
              return []

Import Statements:

We import necessary modules and classes from django_elasticsearch_dsl and our Django models.

Document Definition:

We define a BookDocument class which inherits from Document, provided by django_elasticsearch_dsl.

Registry Registration:

We register the BookDocument class with the registry using the @registry.register_document decorator. This tells the Elasticsearch DSL library to manage this document.

Index Configuration:

We specify the name of the Elasticsearch index for this document as “books”. This index name should be unique within the Elasticsearch cluster.

Django Model Configuration:

Under the Django class nested within BookDocument, we link the document to the Django model (Book) and specify which fields of the model should be indexed.

Fields Mapping:

Inside the BookDocument class, we define fields for the Elasticsearch document. These fields map to the fields in the Django model. The related genre is mapped as an object field, while country and author are mapped as nested fields; each exposes only the name of the related model.

Related Models Handling:

We list the related models (Genre, Country, Author) in related_models so that a change to one of them triggers a re-index of the books that reference it. For each related model, get_instances_from_related defines how to find those books.

Queryset Configuration:

We override the get_queryset method to specify how the queryset should be retrieved. In this case, we use select_related to fetch related objects efficiently.

Instances from Related:

We define the get_instances_from_related method to handle instances from related models. This method is used to retrieve instances related to the main model for indexing purposes.
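To make the mapping more concrete, here is a minimal sketch of the shape one indexed book could take. It uses a plain dataclass instead of the Django model, and BookRow, to_document, and the sample values are all hypothetical; the exact JSON that django-elasticsearch-dsl sends may differ in detail.

```python
from dataclasses import dataclass


@dataclass
class BookRow:
    """Plain stand-in for a Book instance with its related names resolved."""

    title: str
    description: str
    year: int
    rating: float
    genre: str
    country: str
    author: str


def to_document(book: BookRow) -> dict:
    """Build a dict shaped like the fields declared on BookDocument."""
    return {
        # Fields listed under BookDocument.Django.fields
        "title": book.title,
        "description": book.description,
        "year": book.year,
        "rating": book.rating,
        # Related models expose only their name, as in the document class
        "genre": {"name": book.genre},
        "country": {"name": book.country},
        "author": {"name": book.author},
    }


sample = BookRow(
    title="A Sample Title",
    description="A sample description.",
    year=2022,
    rating=4.5,
    genre="Fiction",
    country="Kenya",
    author="Jane Doe",
)
document = to_document(sample)
print(document)
```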

Views

  import copy
  from abc import abstractmethod

  from elasticsearch_dsl import Document, Q
  from rest_framework.decorators import action
  from rest_framework.pagination import LimitOffsetPagination
  from rest_framework.request import Request
  from rest_framework.response import Response
  from rest_framework.viewsets import ModelViewSet

  from elastic_search.books.documents import BookDocument
  from elastic_search.books.models import Book
  from elastic_search.books.serializers import BookSerializer


  class PaginatedElasticSearchAPIView(ModelViewSet, LimitOffsetPagination):
      document_class: Document = None

      @abstractmethod
      def generate_search_query(self, search_terms_list, param_filters):
          """This method should be overridden
          and return a Q() expression."""

      @action(methods=["GET"], detail=False)
      def search(self, request: Request):
          try:
              # Separate the search terms from the remaining query parameters.
              params = copy.deepcopy(request.query_params)
              search_terms = params.pop("search", None)
              query = self.generate_search_query(
                  search_terms_list=search_terms, param_filters=params
              )

              search = self.document_class.search().query(query)
              response = search.to_queryset()

              results = self.paginate_queryset(response)
              serializer = self.serializer_class(results, many=True)

              return self.get_paginated_response(serializer.data)
          except Exception as e:
              # An exception object is not serializable, so return its message.
              return Response({"detail": str(e)}, status=500)


  class BookViewSet(PaginatedElasticSearchAPIView):
      serializer_class = BookSerializer
      queryset = Book.objects.all()
      document_class = BookDocument

      def generate_search_query(self, search_terms_list: list[str], param_filters: dict):
          if search_terms_list is None:
              return Q("match_all")
          # Strip null bytes and treat commas as spaces.
          search_terms = search_terms_list[0].replace("\x00", "")
          search_terms = search_terms.replace(",", " ")
          search_fields = ["title", "description"]
          filter_fields = ["year", "rating"]
          query = Q("multi_match", query=search_terms, fields=search_fields, fuzziness="auto")

          wildcard_query = Q(
              "bool",
              should=[
                  Q("wildcard", **{field: f"*{search_terms.lower()}*"}) for field in search_fields
              ],
          )
          query = query | wildcard_query

          if len(param_filters) > 0:
              filters = []
              for field in filter_fields:
                  if field in param_filters:
                      filters.append(Q("term", **{field: param_filters[field]}))
              filter_query = Q("bool", should=[query], filter=filters)
              query = query & filter_query

          return query

Structure

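The PaginatedElasticSearchAPIView class has two important methods. The generate_search_query method is marked with the abstractmethod decorator to signal that any class inheriting from it has to implement that method. Note that Python only enforces this for classes whose metaclass is ABCMeta, which is not the case here, so the decorator acts as documentation rather than a hard check.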

The other method, search, adds a search endpoint that accepts GET requests and handles the search functionality. It copies the query parameters from the URL, separates the search terms from the remaining parameters, and passes both to generate_search_query. That method should return an Elasticsearch query, which is executed and then converted to a queryset. The queryset is paginated and returned to the user.
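The parameter handling can be tried outside Django with the standard library. This is only an approximation: split_params is a hypothetical helper, and parse_qs returns a plain dict of lists, whereas the view works on a copy of DRF's query_params.

```python
from urllib.parse import parse_qs


def split_params(query_string: str):
    """Separate the 'search' values from the remaining query parameters."""
    params = parse_qs(query_string)
    # Like the view, fall back to None when no search term was sent.
    search_terms = params.pop("search", None)
    return search_terms, params


terms, filters = split_params("search=consumer&year=2022&rating=4.5")
print(terms)
print(filters)

no_terms, no_filters = split_params("")
print(no_terms, no_filters)
```

A missing search parameter yields None, which is the case the view turns into a match_all query.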

In a production app, I would recommend handling the exception by logging the error and falling back to Django REST Framework's search, so that search keeps working even when Elasticsearch is unavailable.
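A minimal sketch of that fallback idea, written without Django so it can run on its own. The names search_with_fallback, broken_primary, and database_fallback are hypothetical; in the real view the primary would be the Elasticsearch search and the fallback a database-backed search.

```python
import logging
from typing import Callable, Iterable

logger = logging.getLogger(__name__)

TITLES = ["The Consumer", "History of Trade", "Modern Markets"]


def search_with_fallback(
    primary: Callable[[str], Iterable[str]],
    fallback: Callable[[str], Iterable[str]],
    terms: str,
) -> list:
    """Try the primary search; log the failure and use the fallback if it raises."""
    try:
        return list(primary(terms))
    except Exception:
        logger.exception("Primary search failed; using fallback search")
        return list(fallback(terms))


def broken_primary(terms: str) -> Iterable[str]:
    """Simulates the search backend being unreachable."""
    raise ConnectionError("search backend unavailable")


def database_fallback(terms: str) -> Iterable[str]:
    """Simple case-insensitive substring match standing in for a database search."""
    return [title for title in TITLES if terms.lower() in title.lower()]


results = search_with_fallback(broken_primary, database_fallback, "consumer")
print(results)
```

The fallback loses fuzzy matching, but the endpoint still returns results instead of a 500 response.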

Implementation

In the BookViewSet, we provide the document that we will execute the search on.

We also implement the abstract method. Let us walk through the query step by step.

Input Parameters:

search_terms_list: These are the words or phrases a user types into the search bar when looking for a book.

param_filters: These are additional filters or conditions a user might want to apply to narrow down their search, like searching only for books that were published in a certain year or that have a certain rating.

Understanding the Search Process:

If the user doesn't provide any search terms, it means they want to see all the books available. So, we create a match_all query to fetch all books.

If the user provides search terms, we want to look for those terms in specific fields of our books, like title or description. We also want to be flexible with our search, allowing for slight misspellings or variations in the search terms. That's where the “fuzziness” parameter comes into play. It helps us find similar words even if the user misspells something.
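To give a feel for what fuzziness="auto" allows, the sketch below applies Elasticsearch's documented default AUTO thresholds (no edits for terms of 1 to 2 characters, one edit for 3 to 5, two edits for longer terms) with a plain Levenshtein distance. The helper names are hypothetical, and Elasticsearch by default also counts a transposition of two adjacent characters as a single edit, which this simplified version does not.

```python
def auto_max_edits(term: str) -> int:
    """Maximum edits allowed for a term under the default AUTO fuzziness."""
    length = len(term)
    if length <= 2:
        return 0
    if length <= 5:
        return 1
    return 2


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution
                )
            )
        previous = current
    return previous[-1]


def fuzzy_match(query_term: str, indexed_term: str) -> bool:
    """True when the indexed term is within the allowed number of edits."""
    return levenshtein(query_term, indexed_term) <= auto_max_edits(query_term)


print(fuzzy_match("consumr", "consumer"))
print(fuzzy_match("cat", "dog"))
```

This is why a slightly misspelled search term can still return the same book.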

Additionally, we support wildcard matching, where the placeholder '*' matches any sequence of characters. For example, the pattern 'hist*' would match 'history', 'historic', etc. The view lowercases the search terms and wraps them as '*terms*', so partial words match as well.
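The wildcard behaviour can be approximated locally with the standard library's fnmatch module. This is only an illustration: fnmatch supports a few extra pattern forms that Elasticsearch wildcards do not, the helper names are hypothetical, and on an analyzed text field Elasticsearch compares the pattern with individual indexed terms rather than the whole field value.

```python
from fnmatch import fnmatchcase


def to_wildcard_pattern(search_terms: str) -> str:
    """Lowercase the terms and wrap them in '*', as the view does."""
    return f"*{search_terms.lower()}*"


def matching_tokens(pattern: str, tokens: list[str]) -> list[str]:
    """Return the tokens that match a '*' wildcard pattern."""
    return [token for token in tokens if fnmatchcase(token, pattern)]


tokens = ["history", "historic", "prehistoric", "story"]

prefix_matches = matching_tokens("hist*", tokens)
wrapped_matches = matching_tokens(to_wildcard_pattern("Hist"), tokens)

print(prefix_matches)
print(wrapped_matches)
```

A leading '*' forces Elasticsearch to scan many terms, so this style of query can be slow on large indices.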

If there are any filter parameters provided, we want to apply those filters to our search results. For example, if a user wants to see only books published in the year 2022, we want to include that condition in our search.

Constructing the Query:

We use the Elasticsearch DSL (Domain-Specific Language) to construct our search query. This query is like a set of instructions written in a language Elasticsearch understands.
We build our query step by step, considering all the different scenarios mentioned above.
We use the Q class from Elasticsearch DSL to create different parts of our query, such as match queries, wildcard queries, and filter queries.
Finally, we combine all these parts to form a comprehensive search query that captures both the user's search terms and any additional filters they might have applied.
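For readers who prefer to see the raw request, the sketch below builds a roughly equivalent query body as plain dictionaries. It mirrors the intent of generate_search_query (at least one text clause must match, and every filter must match); build_query_body is a hypothetical helper, and the JSON that elasticsearch-dsl actually serializes from the combined Q objects may be structured differently.

```python
import json

SEARCH_FIELDS = ["title", "description"]
FILTER_FIELDS = ["year", "rating"]


def build_query_body(search_terms, filters):
    """Build an Elasticsearch request body from search terms and filters."""
    if search_terms is None:
        # No search terms: return every book.
        return {"query": {"match_all": {}}}

    # Fuzzy full-text match across the searchable fields.
    should = [
        {
            "multi_match": {
                "query": search_terms,
                "fields": SEARCH_FIELDS,
                "fuzziness": "AUTO",
            }
        }
    ]
    # One wildcard clause per searchable field.
    should += [
        {"wildcard": {field: {"value": f"*{search_terms.lower()}*"}}}
        for field in SEARCH_FIELDS
    ]

    bool_query = {"should": should, "minimum_should_match": 1}

    # Exact-value filters narrow the results without affecting scoring.
    term_filters = [
        {"term": {field: filters[field]}}
        for field in FILTER_FIELDS
        if field in filters
    ]
    if term_filters:
        bool_query["filter"] = term_filters

    return {"query": {"bool": bool_query}}


body = build_query_body("consumer", {"year": "2022"})
print(json.dumps(body, indent=2))
```

Setting minimum_should_match to 1 matters here: once a filter clause is present, should clauses become optional unless that minimum is set.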

Output:

The method returns the constructed search query, ready to be executed against our Elasticsearch index.
This query will fetch the relevant books based on the user's search terms and filters, providing them with accurate and tailored search results.

URLS

We now set this up in our urls.py file:

  from rest_framework.routers import SimpleRouter

  from elastic_search.books import views

  router = SimpleRouter()

  router.register("books", views.BookViewSet)

  urlpatterns = router.urls

Data

We need data to search against, so there is a factories.py file that will populate the database for us.

First, let's create a database.
Set up Postgres and run the following commands:

  sudo -u postgres psql

  DROP USER IF EXISTS elastic;
  CREATE USER elastic WITH CREATEDB CREATEROLE SUPERUSER LOGIN PASSWORD 'elastic';
  DROP DATABASE IF EXISTS elastic;
  CREATE DATABASE elastic WITH OWNER postgres;
  GRANT ALL ON DATABASE elastic TO elastic;
  \q

Populate the data in the db;

  python manage.py generate_test_data 1000   

This will create a large dataset for us to run our queries against.

Set up elastic search

Run the following to start a local Elasticsearch instance with Docker:

  docker run --rm --name elasticsearch_container -p 9200:9200 -e "discovery.type=single-node" -e "xpack.security.enabled=false" elasticsearch:8.10.2   

Populate index

Now we can populate our index to test out the application:

  python manage.py search_index --rebuild   

Query Time!!!

Start the server

  python manage.py runserver   

Head to Postman or any API testing platform of your choice.

Our base query will be this

  http://localhost:8000/api/books/search/   

A GET request is shown below:

[Image]

Let's make a query for a book with "consumer":

[Image]

Let's misspell "consumer"; we get the same result:

[Image]

Let's test the filter:

[Image]
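If you would rather script these requests than click through an API client, the sketch below builds the same URLs with the standard library. build_search_url is a hypothetical helper, and the filter value 2022 is only an example; the limit and offset parameters come from DRF's LimitOffsetPagination, which the view inherits.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000/api/books/search/"


def build_search_url(search=None, **params):
    """Build a search URL with an optional search term and extra parameters."""
    query = {}
    if search is not None:
        query["search"] = search
    # Extra keyword arguments become filters or pagination parameters.
    query.update(params)
    return f"{BASE_URL}?{urlencode(query)}" if query else BASE_URL


all_books_url = build_search_url()
search_url = build_search_url("consumer")
filtered_url = build_search_url("consumer", year=2022, limit=5)

print(all_books_url)
print(search_url)
print(filtered_url)
```

Each URL can then be passed to any HTTP client while the development server is running.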

Conclusion

We have implemented Elasticsearch in a Django application and tested it live, getting the expected results. Other query types exist, such as nested queries, which could bring author and country into the search and filters, but they are outside the scope of this tutorial. I may add them in a future article. In the next article, we will add a CI/CD pipeline that can be used to test our application.
