Create Your "ownGPT" in a Protected Way and Advance Its Potential (Part 2) - Incorporating your own data to create unique experiences

second part of a series
part 1 (the basics and installing the gui) of the series was published a week earlier

Motivation

In this second part of the series, we will learn how to customize our chatGPT clone with our own data using Azure OpenAI Service (AOAI) and some Azure services. We will use Azure Blob Storage to store our data, Azure Cognitive Search to index and query it, and Azure Search Index to fine-tune our GPT model. This way, we can make our chatbot more relevant and personalized for our users.

Add your data the easy way

  • Go to Azure OpenAI Studio, head over to the playground, and choose Chat

  • Click on the Assistant Setups entry "Add your Data"

  • Click on "Add a data source"

  • Choose "Upload files" and follow the wizard to create an "Azure Blob Storage," an "Azure Cognitive Search Resource" with the ARM templates of the wizard.

  • Enable CORS

  • and assign a meaningful name to the search index.

  • After acknowledging the information, click "Next."

  • Next, upload some test data to the blob storage to test the installation.

  • After finishing the upload, everything will be created in the background, and the Assistant setup tells you to wait some minutes

Once the wizard has completed its task, you'll see the created resource in the status window beneath the wizard. You can then restrict the response to your data and verify its functionality with a brief chat session.

Add the resources to the ownGPT web interface

Test it locally first

Update your .env file and add the following details:

restart your ownGPT by executing ./start.cmd

and try it in your browser at https://127.0.0.1:5000

The result should look similar to the picture below

Update the config of the Web App

  • go to Settings / Applications, and add the "AZURE_Search*" elements from before as application settings. Do not forget to restart the web app.

  • Test the web app in a private browser and see that your identity provider kicks in.

  • The test in the web chat should then give you the same result as testing the chat interface locally.

Things you should consider

With this configuration, all information is visible to all of your authenticated users; you should now limit access to information by introducing "document-level access control".

Another step will be to use semantic search instead of keyword search. I will cover this in a later article in this series.

What's next

Within the following weeks, we will explore together the following elements of creating your "ownGPT"

  • implementing chat history

  • adding semantic search

  • adding additional channels

Conclusion

Incorporating your own data into your chatGPT clone using Azure OpenAI Service and other Azure services can enhance its relevance and personalization. By following the steps outlined in this article, you can easily add your data, test it locally, and update your web app configuration. Remember to consider document-level access control to ensure that sensitive information is only accessible to the appropriate users.

Did you find this article valuable?

Support Holger Imbery by becoming a sponsor. Any amount is appreciated!