Podisen is a WhatsApp chatbot that creates a personalized AI clone of you by learning from your WhatsApp chat history.
With enough quality data, it learns how you respond, your communication style, and your preferences. You can deploy your clone to a dedicated WhatsApp account. Feel free to reach out if you have any questions or suggestions!
- AI-powered Data Organization & Conversation Handling 💫: Podisen goes beyond single question-and-answer pairs. It organizes your raw_data with an LLM so that multi-message chats stay together and the model learns from full conversations, not just single messages.
- Automatic Data Cleanup: Cleans up about 90% of your chat data without you having to do it manually.
- Deep Personalization: With good enough data, the model learns your specific way of communicating and produces responses that sound just like you.
- End-to-end chatbot deployed on WhatsApp.
- Python 3.10 or higher
- Google Cloud Platform account
- WhatsApp Business API access
- Facebook Developer account
- Docker installed locally
- Clone the repository:
git clone https://github.com/yourusername/podisen.git
cd podisen
- Install dependencies:
pip install -r requirements.txt
- Set up environment variables:
Create a .env file in the root directory with the following variables:
WHATSAPP_TOKEN=your_whatsapp_token
VERIFY_TOKEN=your_verify_token
PROJECT_ID=your_gcp_project_id
LOCATION=your_gcp_location
MODEL_ID=your_model_id
PHONE_NUMBER_ID=your_phone_number_id
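A missing variable is easy to overlook until the bot fails at runtime. The following is a minimal sketch of a startup check, assuming the six variable names listed above are all required; the helper name `missing_vars` is hypothetical and not part of the project.

```python
import os

# Variable names taken from the .env listing above.
REQUIRED_VARS = (
    "WHATSAPP_TOKEN",
    "VERIFY_TOKEN",
    "PROJECT_ID",
    "LOCATION",
    "MODEL_ID",
    "PHONE_NUMBER_ID",
)


def missing_vars(env):
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_vars(os.environ)` at startup and stopping when the returned list is non-empty gives a clear error message instead of a failure deep inside a request handler.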
The project includes a Jupyter notebook (0-data-processing/Complete-Data-Processing.ipynb) to process your WhatsApp chat exports into a training dataset.
- Export your WhatsApp chats:
- Open WhatsApp
- Go to a chat, then tap the three-dot menu at the top right > More > Export Chat > Without media
- Save the exported files in whatsapp_data/raw_chats/
- Run the data processing notebook:
- Open 0-data-processing/Complete-Data-Processing.ipynb
- Set your name in the environment variables
- Run all cells to process your chats
- The processed data will be saved in whatsapp_data/processed/
- Additional manual data cleaning:
- Search for "" and replace it with nothing.
- If your chat history includes any sensitive information (like personal names or phone numbers), consider anonymizing or removing those entries.
- Remove any unnecessary metadata or timestamps that may not be relevant for training.
- Check for and delete any duplicate messages that may have been exported.
- Ensure that all messages are in a consistent format (e.g., all lowercase or proper casing).
- Look for any special characters or emojis that may not be processed correctly and decide whether to keep or remove them.
- Save the cleaned data in a new file to avoid overwriting the original exported chats.
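Two of the manual steps above (anonymizing phone numbers and removing duplicates) can be partly scripted. This is a rough sketch, not the notebook's own logic: it assumes messages are available as plain strings, the phone pattern is deliberately loose and will need tuning for your data, and both helper names are hypothetical.

```python
import re

# Loose pattern for phone-like digit runs (an assumption; tune it for your data).
PHONE_RE = re.compile(r"\+?\d[\d\s\-()]{7,}\d")


def mask_phone_numbers(text, placeholder="<PHONE>"):
    """Replace anything that looks like a phone number with a placeholder."""
    return PHONE_RE.sub(placeholder, text)


def drop_consecutive_duplicates(messages):
    """Remove messages that exactly repeat the message right before them."""
    cleaned = []
    for message in messages:
        if not cleaned or cleaned[-1] != message:
            cleaned.append(message)
    return cleaned
```

Only back-to-back repeats are dropped, so a legitimate later "ok" survives. Review the output by hand before training, since a regex cannot catch names or other sensitive details.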
IMPORTANT: Good data means good responses, so please pay close attention to your dataset. For high-quality personalization, I recommend including at least 10,000 entries across various WhatsApp chats.
- Create a new GCP project
- Enable the Vertex AI API
- Create a Cloud Storage bucket
- Upload your processed JSONL file to the bucket
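Before uploading, it can help to confirm that every line of the processed file is a well-formed JSON object, since a single broken line may cause a tuning job to reject the dataset. The sketch below checks syntax only. It does not verify the field names that your chosen base model expects, and the function name is hypothetical.

```python
import json


def validate_jsonl(lines):
    """Return (valid_count, errors) for an iterable of JSONL lines.

    errors is a list of (line_number, reason) tuples; blank lines are skipped.
    """
    valid = 0
    errors = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            errors.append((number, "record is not a JSON object"))
            continue
        valid += 1
    return valid, errors
```

Pass an open file handle for your processed file as `lines`. The valid count also tells you how close the dataset is to the recommended 10,000 entries.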
- Navigate to Vertex AI in Google Cloud Console
- Select "Tuning" under "Models"
- Choose a base model (PaLM 2 or Gemini)
- Configure your fine-tuning job:
- Select your JSONL file
- Set hyperparameters
- Choose training budget
- Start the fine-tuning process
- Deploy the model as an endpoint in Vertex AI
- Note the model endpoint details for the bot implementation
- Register for WhatsApp Business API access
- Complete the verification process
- Set up a WhatsApp Business account
- Go to Facebook Developers
- Create a new app or select an existing one
- Add the WhatsApp product to your app
- Configure Webhook:
- Set Callback URL to your deployed service URL + /webhook
- Set Verify Token (same as in your .env file)
- Subscribe to messages events
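When the webhook is saved, Meta sends a GET request to the callback URL with the query parameters hub.mode, hub.verify_token, and hub.challenge, and expects the challenge value echoed back. The following is a framework-agnostic sketch of that check; the function name and the (status, body) return shape are assumptions, and the project's actual handler may look different.

```python
import hmac


def verify_webhook(params, verify_token):
    """Handle the GET verification request; return (status_code, body)."""
    mode = params.get("hub.mode", "")
    token = params.get("hub.verify_token", "")
    challenge = params.get("hub.challenge", "")
    # compare_digest avoids leaking the token through timing differences.
    token_ok = hmac.compare_digest(token.encode(), verify_token.encode())
    if mode == "subscribe" and token_ok:
        return 200, challenge
    return 403, "Forbidden"
```

Here `params` is the parsed query string as a dict, and `verify_token` is the VERIFY_TOKEN value from your .env file.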
- In Facebook Developer Console, go to WhatsApp > Getting Started
- Find your Phone Number ID in the API Setup section
- Add this ID to your .env file
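The Phone Number ID is part of the URL used to send replies through the WhatsApp Cloud API. The sketch below only builds the request so the structure is visible. The Graph API version string is an assumption (use whichever version your app is configured for), and the function name is hypothetical.

```python
import json
import urllib.request

GRAPH_API_VERSION = "v19.0"  # assumption: match the version your app uses


def build_text_message_request(phone_number_id, token, to, body):
    """Build (but do not send) a POST request for a plain text message."""
    url = f"https://graph.facebook.com/{GRAPH_API_VERSION}/{phone_number_id}/messages"
    payload = {
        "messaging_product": "whatsapp",
        "to": to,
        "type": "text",
        "text": {"body": body},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the message. Here `token` corresponds to WHATSAPP_TOKEN and `phone_number_id` to PHONE_NUMBER_ID.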
- Build and push the Docker image:
gcloud builds submit --tag location-docker.pkg.dev/projectId/whatsapp-bot
- Deploy to Cloud Run. Use the project ID in text form, not the numeric project ID; otherwise the deployment fails with an error on the GCP side. Run the command below as a single command (the trailing backslashes continue the line):
gcloud run deploy whatsapp-bot \
--image location-docker.pkg.dev/projectId/whatsapp-bot \
--platform managed \
--region location \
--allow-unauthenticated \
--set-env-vars="WHATSAPP_TOKEN=token,VERIFY_TOKEN=Vtoken,PROJECT_ID=projectIDinName,LOCATION=location,MODEL_ID=modelID"
- After deployment, your webhook URL will be:
https://your-service-url/webhook
- In Facebook Developer Console:
- Go to WhatsApp > Configuration
- Enter your webhook URL
- Enter your Verify Token
- Click "Verify and Save"
- Test your bot by sending a message to your WhatsApp Business number
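Once the webhook is verified, each incoming message arrives as a POST request whose JSON body nests messages under entry > changes > value > messages. Here is a small sketch of pulling out text messages from such a payload; the function name is hypothetical, and non-text messages and status updates are simply skipped.

```python
def extract_text_messages(payload):
    """Return a list of (sender, text) pairs from a webhook POST payload."""
    results = []
    for entry in payload.get("entry", []):
        for change in entry.get("changes", []):
            value = change.get("value", {})
            # Status updates carry no "messages" key and are skipped here.
            for message in value.get("messages", []):
                if message.get("type") == "text":
                    results.append((message["from"], message["text"]["body"]))
    return results
```

If a test message does not appear in the result, check the Cloud Run logs to see whether the payload was a status update or a non-text message type.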
- Make sure to use the project ID in text form, not the numeric project ID
- Keep your tokens and API keys secure
- Monitor your Cloud Run logs for any issues
- The bot maintains conversation history for context
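How the bot stores conversation history is not described here. As an illustration only, one simple approach is a bounded per-user buffer held in memory; the class name and the 20-message limit below are assumptions.

```python
from collections import defaultdict, deque

MAX_TURNS = 20  # assumption: keep only the most recent messages per user


class ConversationHistory:
    """Keep the last few (role, text) pairs for each user, in memory."""

    def __init__(self, max_turns=MAX_TURNS):
        self._store = defaultdict(lambda: deque(maxlen=max_turns))

    def add(self, user_id, role, text):
        # The deque silently drops the oldest item once it is full.
        self._store[user_id].append((role, text))

    def get(self, user_id):
        return list(self._store[user_id])
```

With an in-memory store like this, history is lost whenever a Cloud Run instance restarts and is not shared between instances, so an external store would be needed for durable context.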
Contributions are welcome! This is an open-source project, and I'd love to see your ideas and improvements. Here's how you can contribute:
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Please make sure to update tests as appropriate and adhere to the existing code style.
This project is licensed under the MIT License - see the LICENSE file for details.
Hi! I'm Geethika Isuru, an AI Engineer & Entrepreneur who's trying to make a better world with AI.