Commit 4623966b authored by abir.chebbi

readme

parent 64a49f18
@@ -4,6 +4,7 @@
1. AWS CLI: Ensure the AWS CLI is installed and configured on your laptop (refer to the setup guide provided in Session 1).
2. Ensure Python is installed: Python 3.8 or higher.
3. Install the required Python libraries listed in 'requirements.txt':
`pip3 install -r requirements.txt`
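The Python version requirement above can be checked before installing anything. The following is a minimal sketch, not part of the repository; the helper name `python_is_supported` is hypothetical and only the 3.8 minimum comes from the prerequisites.

```python
import sys

# Minimum version stated in the prerequisites (Python 3.8 or higher).
MIN_PYTHON = (3, 8)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum.

    Hypothetical helper for illustration; it is not defined in the repository.
    """
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if python_is_supported():
        print("Python version OK")
    else:
        found = sys.version.split()[0]
        print(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, found {found}")
```

Running this file with the same interpreter you plan to use for the scripts shows whether the prerequisite is met.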
@@ -11,18 +12,22 @@
### Step 1: Object Storage Creation
Create an S3 bucket and upload a few PDF files by running:
`python create-S3-and-put-docs.py --bucket_name [YourBucketName] --local_path [PathToYourPDFFiles]`
Where:
-`--bucket_name`: The name for the new S3 bucket to be created.
-`--local_path`: The local directory path where the PDF files are stored.
+- **--bucket_name**: The name for the new S3 bucket to be created.
+- **--local_path**: The local directory path where the PDF files are stored.
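The source of `create-S3-and-put-docs.py` is not shown in this diff. As a rough sketch only, the two arguments described above could be parsed and the local PDF files collected as follows. It assumes the script uses `argparse`; the helper names `parse_args` and `find_pdfs` are hypothetical, and the actual S3 calls are deliberately left out.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the two documented options (assumed to be required)."""
    parser = argparse.ArgumentParser(
        description="Sketch: create an S3 bucket and upload PDF files."
    )
    parser.add_argument("--bucket_name", required=True,
                        help="Name for the new S3 bucket to be created")
    parser.add_argument("--local_path", required=True,
                        help="Local directory where the PDF files are stored")
    return parser.parse_args(argv)


def find_pdfs(local_path):
    """Return the PDF files directly inside local_path, sorted by name."""
    directory = Path(local_path)
    return sorted(
        p for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


if __name__ == "__main__":
    args = parse_args()
    for pdf in find_pdfs(args.local_path):
        # The real script would upload each file to args.bucket_name here.
        print(f"would upload {pdf.name} to {args.bucket_name}")
```

Listing the files before uploading makes it easy to confirm that `--local_path` points at the intended directory.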
### Step 2: Vector Store Creation
Create a vector database for storing embeddings by running:
`python create-vector-db.py --collection_name [Name_of_collection] --IAM_user [YourIAM_User]`
Where:
-`--collection_name`: Name of the collection that you want to create to store embeddings.
-`--IAM_user`: The IAM user name. For example, for group 14 the IAM user is master-group-14.
+- **--collection_name**: Name of the collection that you want to create to store embeddings.
+- **--IAM_user**: The IAM user name. For example, for group 14 the IAM user is master-group-14.
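The IAM user example above suggests a naming pattern. The sketch below assumes, from that single example, that every group's user is `master-group-` followed by the group number; this pattern and the helper name `iam_user_for_group` are assumptions, not something the README confirms.

```python
def iam_user_for_group(group_number: int) -> str:
    """Build the IAM user name for a group.

    Assumes the pattern 'master-group-<number>' generalizes from the
    README's single example (group 14 -> master-group-14).
    """
    if group_number < 1:
        raise ValueError("group_number must be a positive integer")
    return f"master-group-{group_number}"


if __name__ == "__main__":
    print(iam_user_for_group(14))
```

If your group's IAM user follows a different convention, pass the exact user name to the script instead.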
This script performs the following actions:
@@ -35,12 +40,14 @@ This script performs the following actions:
After setting up the S3 bucket and Vector Store, we can process the PDF files to generate embeddings and store them in the vector database.
Run:
`python main.py --bucket_name [YourBucketName] --endpoint [YourVectorDBEndpoint]`
Where:
-`--bucket_name`: The name of the S3 bucket containing the PDF files.
-`--endpoint`: Endpoint for the vector database.
-`--index_name`: The name of the index in the collection in which to store the embeddings.
+- **--bucket_name**: The name of the S3 bucket containing the PDF files.
+- **--endpoint**: Endpoint for the vector database.
+- **--index_name**: The name of the index in the collection in which to store the embeddings.
The main.py script will:
1. Download PDF files from the S3 bucket.
......