leo.pellandi / chatbot-lab-groupe4
Commit 4623966b, authored 8 months ago by abir.chebbi
readme
Parent: 64a49f18
Showing 1 changed file: README.md, with 14 additions and 7 deletions
@@ -4,6 +4,7 @@
1. AWS CLI: Ensure AWS CLI is installed and configured on your laptop (refer to the setup guide provided in Session 1).
2. Ensure Python is installed: Python 3.8 or higher.
3. Install the required Python libraries listed in `requirements.txt`:
`pip3 install -r requirements.txt`
@@ -11,18 +12,22 @@
### Step 1: Object Storage Creation
Create an S3 bucket and upload a few PDF files by running:
`python create-S3-and-put-docs.py --bucket_name [YourBucketName] --local_path [PathToYourPDFFiles]`
Where:
-`--bucket_name`: The name for the new S3 bucket to be created.
-`--local_path`: The local directory path where the PDF files are stored.
+- **--bucket_name**: The name for the new S3 bucket to be created.
+- **--local_path**: The local directory path where the PDF files are stored.
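As a rough illustration of what Step 1 involves, the sketch below creates a bucket and uploads every PDF found in a local directory. It is not the contents of `create-S3-and-put-docs.py`, which this diff does not show. The function names `find_pdfs` and `create_bucket_and_upload` are hypothetical, and the client argument is assumed to behave like a boto3 S3 client (`boto3.client("s3")`), of which only `create_bucket` and `upload_file` are used.

```python
from pathlib import Path


def find_pdfs(local_path):
    """Return the PDF files directly inside local_path, sorted by path."""
    return sorted(
        p for p in Path(local_path).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def create_bucket_and_upload(s3_client, bucket_name, local_path):
    """Create the bucket, then upload each PDF under its file name.

    s3_client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    Outside us-east-1, create_bucket also needs a CreateBucketConfiguration
    with a LocationConstraint; that is left out of this sketch.
    """
    s3_client.create_bucket(Bucket=bucket_name)
    uploaded = []
    for pdf in find_pdfs(local_path):
        # upload_file(Filename, Bucket, Key): the object key is the file name.
        s3_client.upload_file(str(pdf), bucket_name, pdf.name)
        uploaded.append(pdf.name)
    return uploaded
```

Here `bucket_name` and `local_path` correspond to the two flags described above. Passing the client in as an argument keeps the function easy to try out with a stand-in object before pointing it at a real AWS account.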
### Step 2: Vector Store Creation
Create a vector database for storing embeddings by running:
`python create-vector-db.py --collection_name [Name_of_collection] --IAM_user [YourIAM_User]`
Where:
-`--collection_name`: Name of the collection that you want to create to store embeddings.
-`--IAM_USER`: For example for group 14 the IAM USER = master-group-14
+- **--collection_name**: Name of the collection that you want to create to store embeddings.
+- **--IAM_USER**: For example for group 14 the IAM USER = master-group-14
This script performs the following actions:
@@ -35,12 +40,14 @@ This script performs the following actions:
After setting up the S3 bucket and Vector Store, we can process the PDF files to generate and store embeddings in the vector database.
Run:
`python main.py --bucket_name [YourBucketName] --endpoint [YourVectorDBEndpoint]`
Where:
-`--bucket_name`: The name of the S3 bucket containing the PDF files.
-`--endpoint`: Endpoint for the vector database.
-`--index_name`: The index_name where to store the embeddings in the collection.
+- **--bucket_name**: The name of the S3 bucket containing the PDF files.
+- **--endpoint**: Endpoint for the vector database.
+- **--index_name**: The index_name where to store the embeddings in the collection.
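A minimal sketch of how the three flags above could be read with `argparse`, assuming the script uses the standard library parser (the diff does not show `main.py` itself). `build_parser` is a hypothetical name. The run command above passes only `--bucket_name` and `--endpoint`, so the sketch treats those two as required and leaves `--index_name` optional with no default; the real script may differ.

```python
import argparse


def build_parser():
    """Parser for the three main.py flags named in the README."""
    parser = argparse.ArgumentParser(
        description="Generate embeddings from PDFs in S3 and store them."
    )
    parser.add_argument("--bucket_name", required=True,
                        help="S3 bucket containing the PDF files")
    parser.add_argument("--endpoint", required=True,
                        help="Endpoint for the vector database")
    # Assumption: optional, because the README's run command omits it.
    parser.add_argument("--index_name", default=None,
                        help="Index in which to store the embeddings")
    return parser
```

Calling `build_parser().parse_args()` inside the script would then expose the values as `args.bucket_name`, `args.endpoint`, and `args.index_name`.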
The main.py script will:
1. Download PDF files from the S3 bucket.
...
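The download step listed above could look roughly like the following. This is a sketch under assumptions, not the code in `main.py`: `download_pdfs` is a hypothetical name, and the client is assumed to behave like a boto3 S3 client, using its `list_objects_v2` paginator and `download_file` method.

```python
import os


def download_pdfs(s3_client, bucket_name, dest_dir):
    """Download every object whose key ends in .pdf into dest_dir.

    s3_client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    os.makedirs(dest_dir, exist_ok=True)
    downloaded = []
    # list_objects_v2 returns at most 1000 keys per call, so paginate.
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name):
        # "Contents" is absent from a page when there are no objects.
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if not key.lower().endswith(".pdf"):
                continue
            # Flatten any key prefix so the file lands directly in dest_dir.
            target = os.path.join(dest_dir, os.path.basename(key))
            # download_file(Bucket, Key, Filename)
            s3_client.download_file(bucket_name, key, target)
            downloaded.append(target)
    return downloaded
```

Flattening key prefixes keeps the local layout simple, at the cost of overwriting files when two keys share a base name; a real pipeline might preserve the prefix instead.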