Bayard is a Flask application that provides a natural language processing API endpoint. It leverages the power of Elasticsearch for document search and OpenAI's language models for generating contextual responses based on user input and relevant documents. The application is designed to be scalable and robust, with a focus on efficient data storage and retrieval.
At its core, the Bayard App consists of a Flask server that exposes a single API endpoint:
/api/bayard
. This endpoint accepts POST requests with a JSON payload containing the input_text
field, which represents the user's natural language input.
python@app.route("/api/bayard", methods=["POST"]) def handle_bayard_request(): input_text = request.json.get("input_text") # ... process input_text ...
Upon receiving a request, the application first searches Elasticsearch for relevant documents based on the
input_text
. Elasticsearch is a powerful search engine that allows for efficient and flexible document retrieval, ensuring that the most relevant context is provided to the language model.
pythonfiltered_docs = search_elasticsearch(input_text)
Next, the application leverages OpenAI's language model to generate a contextual response, taking into account both the
input_text
and the relevant documents retrieved from Elasticsearch. OpenAI's state-of-the-art natural language processing models are capable of understanding and generating human-like text, providing accurate and insightful responses.
pythonmodel_output_json = generate_model_output(input_text, filtered_docs) model_output_data = json.loads(model_output_json)
To ensure data persistence and enable potential future analysis, the application stores each model run in a PostgreSQL database hosted on Render. The database schema includes fields for the run_id, timestamp, input_text, and model_output, allowing for efficient storage and retrieval of model run data.
pythonwith db_connection.cursor() as cursor: cursor.execute(""" INSERT INTO model_runs (run_id, timestamp, input_text, model_output) VALUES (%s, %s, %s, %s) ON CONFLICT (run_id) DO UPDATE SET timestamp = EXCLUDED.timestamp, input_text = EXCLUDED.input_text, model_output = EXCLUDED.model_output; """, (run_id, timestamp, input_text, model_output_data['modelOutput'])) db_connection.commit()
Finally, the endpoint returns a JSON response containing the
runId
, timestamp
, modelOutput
, and relevantDocuments
, providing the client with the necessary information for further processing or display.The Bayard App is designed with scalability and robustness in mind, leveraging industry-standard technologies and best practices for natural language processing, data storage, and API development. Its modular architecture and integration with powerful services like Elasticsearch and OpenAI make it a versatile solution for various natural language processing tasks.
Elasticsearch UtilityOpenAI UtilityWL1.0GP License TermsEthical Use Policy for the Bayard Corpus