Add incremental TF-IDF learning fix for chatbot sample (issue #157) #189

Open
Hardikrepo wants to merge 1 commit into microsoft:master from Hardikrepo:fix/chatbot-incremental-tfidf-157

@Hardikrepo

Summary

  • Adds a community sample that fixes the O(n^2) growth issue raised in Chatbot #157: the original chatbot's learn_from_pair refit the entire TF-IDF vectorizer on every call, so the total work over n learn calls grew quadratically.
  • The fix reuses the already-fitted vectorizer's transform() for each new example and appends the resulting row via scipy.sparse.vstack, so a learn call vectorizes only the new example instead of refitting on the whole corpus. A periodic full rebuild (default: every 20 additions) resyncs the vocabulary and IDF weights.
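
The approach described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: it assumes the vectorizer is scikit-learn's TfidfVectorizer, and the class name IncrementalTfidfBot, the respond method, and the pending counter are hypothetical names chosen for the example. Only learn_from_pair, transform(), scipy.sparse.vstack, and the default interval of 20 come from the description above.

```python
from scipy.sparse import vstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


class IncrementalTfidfBot:
    """Hypothetical stand-in for the sample's chatbot (names are assumptions)."""

    def __init__(self, pairs, rebuild_every=20):
        # pairs must be non-empty: fitting on an empty corpus raises ValueError.
        self.questions = [q for q, _ in pairs]
        self.answers = [a for _, a in pairs]
        self.rebuild_every = rebuild_every
        self.pending = 0  # additions since the last full rebuild
        self.vectorizer = TfidfVectorizer()
        self._rebuild()

    def _rebuild(self):
        # Full refit: recomputes the vocabulary and IDF weights from scratch.
        self.matrix = self.vectorizer.fit_transform(self.questions)
        self.pending = 0

    def learn_from_pair(self, question, answer):
        self.questions.append(question)
        self.answers.append(answer)
        self.pending += 1
        if self.pending >= self.rebuild_every:
            self._rebuild()
        else:
            # Vectorize only the new example with the existing vocabulary/IDF,
            # then append its row to the stored matrix.
            row = self.vectorizer.transform([question])
            self.matrix = vstack([self.matrix, row], format="csr")

    def respond(self, text):
        # Return the answer whose stored question is most similar to the input.
        query = self.vectorizer.transform([text])
        scores = cosine_similarity(query, self.matrix).ravel()
        return self.answers[int(scores.argmax())]
```

In this sketch, words that first appear in a newly learned question are not in the fitted vocabulary, so transform() ignores them until the next rebuild. That is the gap the periodic rebuild is meant to close.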

Context

See discussion in #157 for the original script and review comments identifying the O(n^2) issue and suggesting an incremental approach.
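
To make the O(n^2) concern concrete, here is a small counting sketch. It does not reproduce the original script; it only counts how many documents get vectorized under each strategy, and both function names are hypothetical.

```python
def docs_vectorized_refit_every_call(n_initial, n_learns):
    """Documents vectorized when every learn call refits on the full corpus."""
    total = 0
    size = n_initial
    for _ in range(n_learns):
        size += 1
        total += size  # a full refit touches every stored document
    return total


def docs_vectorized_incremental(n_initial, n_learns, rebuild_every=20):
    """Documents vectorized when only every rebuild_every-th call refits."""
    total = 0
    size = n_initial
    for i in range(1, n_learns + 1):
        size += 1
        if i % rebuild_every == 0:
            total += size  # periodic full rebuild
        else:
            total += 1  # transform only the new example
    return total


print(docs_vectorized_refit_every_call(0, 100))  # 5050
print(docs_vectorized_incremental(0, 100))       # 395
```

Under this simplified model, a fixed rebuild interval cuts the work by roughly a factor of the interval. The rebuilds themselves still grow with corpus size, so the interval is a tuning knob rather than a complete removal of the quadratic term.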

Test plan

  • Run community-samples/tfidf-chatbot-incremental-fix/simple_chatbot.py interactively and confirm learn: commands work and responses remain correct after several incremental learns.

Avoids O(n^2) growth from refitting the vectorizer on every
learn_from_pair call by appending via scipy.sparse.vstack and
rebuilding only periodically.
@Hardikrepo

Author

@microsoft-github-policy-service agree
