COMP 10261: The FAQ Bot Plus Project
Sam Scott, Mohawk College, January 2022
An FAQ Bot answers questions about a particular topic. It is a
conversational interface to a stock set of questions and answers.
When an FAQ Bot receives an utterance, it determines the
user’s intent by matching that utterance to one of its stored
question and answer pairs. If it succeeds in determining intent in
this way, it uses the answer as its response. In the example on
the right (from Vajjala et al.’s Practical Natural Language
Processing) the FAQ Bot has determined that the first two
utterances have the same intent and has responded with the
same text in both cases.
If an FAQ Bot fails to determine intent, it usually outputs a
standard message to let the user know that it does not know the
answer. But your FAQ Bot Plus will use linguistic knowledge
from spaCy to get a bit chattier in this case.
This handout brings together all the project requirements for
the final project submission.
PHASE 1: FAQ BOT
In this phase, the goal is to update your Phase 0 FAQ Bot using fuzzy regular expressions to determine a
user’s intent.
- From Phase 0 (Should already be complete). Determine your FAQ Bot’s knowledge domain and
prepare a set of 20 question and answer pairs. One easy way to do this is to find a long
Wikipedia page and copy sections of 1 to 3 sentences as each answer and generate a question
to go with each answer. Make sure you reference all online sources in comments.
- Generalize by generating at least one more possible question for each answer. Ideally, the new
question should have a different wording, representing another way a user might ask for the
information in the answer.
- Create a fuzzy regular expression for each answer that is capable of matching key parts of both
possible questions and is tolerant to a limited number of typos in each question.
- Store questions, answers, and regular expressions in text files.
- Create a Python program (or modify your Phase 0 FAQ Bot) to load the answers and regular
expressions from files, then allow the user to make utterances. Try to find the best match for
the user’s utterance from your list of regular expressions and output the corresponding answer
as a response. When there are multiple matches, you should have some strategy for
determining which match is better.
- The bot should also respond to “hello” by greeting the user, and “goodbye” or “quit” by ending
the program. If it fails to match an utterance, the bot should politely let the user know that it
didn’t recognize their question.
[Example exchange from Vajjala et al.’s Practical Natural Language Processing:
Utterance: Are there limits to the size of dataset I can use for training?
Response: Amazon Machine Learning can train models on datasets up to 100GB in size.
Utterance: What is the maximum size of training dataset?
Response: Amazon Machine Learning can train models on datasets up to 100GB in size.
Utterance: What algorithm does Amazon Machine Learning use to generate models?
Response: Amazon Machine Learning currently uses an industry standard logistic regression algorithm to generate models.]
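One way the matching step could be sketched is shown below. It assumes the third-party `regex` module is installed, since the standard `re` module has no fuzzy matching. The function names, the example pattern, the greeting wording, and the "fewest errors, then longest match" tie-breaking rule are illustrative assumptions, not requirements of this handout.

```python
"""Sketch: pick the best fuzzy-regex match for an utterance (illustrative only)."""

# Hypothetical pattern/answer pair; {e<=2} allows up to two typos in the group.
EXAMPLE_FAQ = [
    (
        r"(?:size of (?:the )?(?:training )?dataset){e<=2}",
        "Amazon Machine Learning can train models on datasets up to 100GB in size.",
    ),
]


def find_candidates(utterance, faq):
    """Return (error_count, match_length, answer) for every pattern that matches.

    `faq` is a list of (fuzzy_pattern, answer) pairs.
    """
    import regex  # assumption: installed with `pip install regex`

    candidates = []
    for pattern, answer in faq:
        flags = regex.IGNORECASE | regex.BESTMATCH
        match = regex.search(pattern, utterance, flags=flags)
        if match:
            # fuzzy_counts is (substitutions, insertions, deletions)
            errors = sum(match.fuzzy_counts)
            candidates.append((errors, len(match.group()), answer))
    return candidates


def pick_best(candidates):
    """Prefer the candidate with the fewest typos, then the longest matched text."""
    if not candidates:
        return None
    best = min(candidates, key=lambda c: (c[0], -c[1]))
    return best[2]


def respond(utterance, faq):
    """Handle greetings and exits, then fall back to the FAQ patterns."""
    text = utterance.strip().lower()
    if text == "hello":
        return "Hello! Ask me a question."
    if text in ("goodbye", "quit"):
        return None  # signals the caller to end the program
    answer = pick_best(find_candidates(utterance, faq))
    return answer or "Sorry, I didn't recognize that question."
```

A shell loop could call `respond(input(), EXAMPLE_FAQ)` repeatedly and stop when it returns `None`. Other tie-breaking strategies, such as ranking patterns by how many keywords they contain, may suit your knowledge domain better.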
Test your bot as much as possible. Use the original question, the alternate wordings, and any other
wordings you can think of. If possible, give the bot to a friend or family member to play with and see
how well it works for them. Tweak your regular expressions as necessary to get the best possible
PHASE 2: FAQ BOT PLUS
In this phase, the goal is to make the FAQ Bot a bit chattier or human-like using linguistic knowledge
from the spaCy module. It should still answer the user’s questions as before, but if it fails to figure out a
user’s intent, it should employ a range of strategies to try to craft an appropriate response. This part of the
project is open-ended and creative, but you must make use of the spaCy pattern matcher with parts of
speech and/or lemmas in at least one part of your bot.
NAMED ENTITY RECOGNITION AND NOUN CHUNKS
When the bot doesn’t know what the user is talking about, Named Entity Recognition or even Noun
Chunks could help implement a fallback strategy. Here are some examples:
Utterance: Does the college have a relationship with Twitter?
(spaCy reports that Twitter is an organization – label ORG)
Response: Sorry, I don’t know. I don’t work for Twitter.
Utterance: Does Chicago have any colleges?
(spaCy reports that Chicago is a geo-political entity – label GPE)
Response: Sorry, I don’t know. I’ve never been to Chicago.
Utterance: Where is the general store located?
(spaCy finds the noun chunk “the general store”)
Response: Sorry, I don’t know anything about the general store.
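The examples above could be turned into code along the following lines. This is a minimal sketch: the function names and the choice to echo the last noun chunk are assumptions, and `nlp` is assumed to be a loaded English spaCy pipeline (for example from `spacy.load("en_core_web_sm")`) that includes a parser and a named entity recognizer.

```python
"""Sketch: build a fallback reply from named entities or noun chunks."""

# Reply templates keyed by spaCy entity label; wording follows the examples above.
ENTITY_TEMPLATES = {
    "ORG": "Sorry, I don't know. I don't work for {}.",
    "GPE": "Sorry, I don't know. I've never been to {}.",
}


def fallback_reply(entities, noun_chunks):
    """Choose a reply from (text, label) entity pairs, then from noun chunk texts."""
    for text, label in entities:
        if label in ENTITY_TEMPLATES:
            return ENTITY_TEMPLATES[label].format(text)
    if noun_chunks:
        # Arbitrary choice: echo the last noun chunk in the utterance.
        return "Sorry, I don't know anything about {}.".format(noun_chunks[-1])
    return "Sorry, I don't know."


def analyze(utterance, nlp):
    """Extract entities and noun chunks using a loaded spaCy pipeline."""
    doc = nlp(utterance)
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    noun_chunks = [chunk.text for chunk in doc.noun_chunks]
    return entities, noun_chunks
```

Keeping `fallback_reply` separate from spaCy makes it easy to test with hand-written entity lists. In the bot itself, the two could be combined as `fallback_reply(*analyze(utterance, nlp))`.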
SPEECH ACT CLASSIFICATION
To make the bot seem chattier or more human-like when it fails to match a user intent, you could
attempt to classify the speech act of the utterance. You can think of a speech act as a very high-level
intent that indicates what kind of action the user is trying to accomplish with their utterance. For
example, they could be asking a question, making a command, promising something, agreeing or
disagreeing with the bot, greeting the bot, etc. You might be able to figure this out by developing some
linguistic patterns in spaCy.
If the bot cannot determine the user’s intent using fuzzy regular expressions, it would at least be useful
to figure out if they are asking a question, trying to give you a command, or simply making a statement.
You could respond to questions with “Sorry, I don’t know the answer to that.” Or even “Sorry, I don’t
know about ___” if you can identify some noun phrase that represents what the user is asking about.
Commands could be responded to differently. “Sorry, I don’t know how to do that.” Or if you can figure
out what they want the bot to do, you could say “Sorry, I don’t know how to ___”.
EXAMPLE QUESTIONS
To get you started, here’s a list of questions – see any patterns here?
Do you know anything about Jujitsu?
What is the capital of Albania?
How did you know that?
Where is my phone?
Why won’t you answer my questions?!?!?!
You’re what kind of bot, now?
Do I really have time for this…
(Note: The question marks are obviously a useful clue about whether something is a question or not, but
users will not always type them, and speech recognition systems might not include them when they
transcribe voice to text. Make sure you create patterns that will still work when there is no
question mark.)
EXAMPLE COMMANDS
And here’s a list of commands…
Give me info about Jujitsu.
Tell me something interesting.
Don’t say “I don’t know” again.
Go get me some useful information.
Make me a cup of coffee.
Drive me to the airport, please.
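As a starting point, the example questions and commands above suggest patterns on the first token of an utterance. The sketch below assumes spaCy v3 and its `Matcher` class. The pattern set, the function names, and the reply given for statements are illustrative assumptions. The tagger will not label every utterance as expected, so treat the patterns as something to test and refine.

```python
"""Sketch: classify an utterance as a question, command, or statement."""

# Token patterns for spaCy's Matcher. Each pattern describes how an utterance
# begins, so no question mark is required.
SPEECH_ACT_PATTERNS = {
    "QUESTION": [
        # Starts with a wh-word: what, where, why, how, which
        [{"TAG": {"IN": ["WP", "WRB", "WDT"]}}],
        # Starts with an auxiliary followed by its subject: "Do you ...", "Is the ..."
        [{"POS": "AUX"}, {"POS": {"IN": ["PRON", "PROPN", "NOUN", "DET"]}}],
    ],
    "COMMAND": [
        # Starts with a base-form verb: "Give me ...", "Tell me ..."
        [{"TAG": "VB"}],
    ],
}


def build_matcher(nlp):
    """Create a Matcher that holds the patterns above."""
    from spacy.matcher import Matcher  # assumption: spaCy v3 is installed

    matcher = Matcher(nlp.vocab)
    for name, patterns in SPEECH_ACT_PATTERNS.items():
        matcher.add(name, patterns)
    return matcher


def classify(doc, matcher):
    """Return the name of the first reported pattern that matches at token 0."""
    for match_id, start, _end in matcher(doc):
        if start == 0:
            return doc.vocab.strings[match_id]
    return "STATEMENT"


def reply_for(speech_act, topic=None):
    """Pick a fallback reply; `topic` is an optional phrase taken from the utterance."""
    if speech_act == "QUESTION":
        if topic:
            return f"Sorry, I don't know about {topic}."
        return "Sorry, I don't know the answer to that."
    if speech_act == "COMMAND":
        if topic:
            return f"Sorry, I don't know how to {topic}."
        return "Sorry, I don't know how to do that."
    return "Interesting. Tell me more."  # placeholder wording, not from the handout
```

With a loaded pipeline `nlp`, the pieces could be combined as `reply_for(classify(nlp(utterance), build_matcher(nlp)))`. An utterance such as “You’re what kind of bot, now?” does not start with any of these patterns, which is one reason to experiment with additional ones.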
OTHER IDEAS
What other things do you think a user might say to your bot? Can you use spaCy patterns to identify
more things you could respond to, or even plant some fun easter eggs for the user to find by saying
something that fits the right pattern? Feel free to implement any other ideas you may have on how to
make the bot chattier using linguistic knowledge. Have fun with it.
PHASE 3: DISCORD
Once the bot is working well in the Python shell, you should repackage it as a Discord bot and include a
link to add the bot to a server. If you want to host your Discord bot on CSUNIX or some other server, go
for it, but it’s not necessary as long as you hand in the code so that the instructor can run it themselves.
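One possible way to keep the shell and Discord versions identical apart from the interface code is to route both through a single response function. The sketch below assumes the third-party discord.py package (version 2.x) and a bot token from the Discord developer portal. The `respond` function here is a hypothetical stand-in for your bot's real logic.

```python
"""Sketch: share one respond() function between the shell and Discord versions."""


def respond(utterance):
    """Stand-in for the bot's real logic (hypothetical placeholder)."""
    if utterance.strip().lower() == "hello":
        return "Hello!"
    return "Sorry, I didn't recognize that question."


def run_shell():
    """Read utterances from the keyboard until the user says goodbye or quit."""
    while True:
        utterance = input(">>> ")
        if utterance.strip().lower() in ("goodbye", "quit"):
            print("Goodbye!")
            break
        print(respond(utterance))


def run_discord(token):
    """Connect to Discord and answer every message with respond()."""
    import discord  # assumption: discord.py 2.x is installed

    intents = discord.Intents.default()
    intents.message_content = True  # needed to read message text in discord.py 2.x
    client = discord.Client(intents=intents)

    @client.event
    async def on_message(message):
        if message.author == client.user:  # ignore the bot's own messages
            return
        await message.channel.send(respond(message.content))

    client.run(token)
```

Because both `run_shell` and `run_discord` call the same `respond`, any improvement to the matching logic shows up in both versions of the bot.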
You should place all the following into a single project folder, then zip it up and hand it in on Canvas.
- A folder containing all the code and supporting files for your bot. It should be possible to run the
bot (both Discord and standalone) from this folder using Anaconda Python 3 with spaCy and the
English language models installed.
- A text file called “phase 1.txt” containing the questions and answers that you used when
developing the FAQ Bot. There should be two questions for each answer, and it should be clear
which answer goes with which questions. I will use the questions in this file when I’m testing
your bot.
- A text file called “phase 2.txt”. This file should contain any special instructions needed to get the
most out of the “chattier” aspects of your bot. How should we test your bot to see all the cool
stuff you included? Describe what kinds of utterances your bot can respond to and give us some
sample utterances that show your bot behaving at its chatty best.
- A text file called “phase 3.txt”. This file should contain the link to the Discord version of your bot
along with any special instructions required to talk to it (prefixes, etc.), or any other special
features you want to show off that are unique to this version of the bot.
Your project will be marked out of 20 using the following Rubric.
Category Level 4: 100% Level 3: 75% Level 2: 50% Level 1: 25%
Phase 1: FAQ Bot (4 points)
- Level 4: Uses regex efficiently and effectively to answer all questions identified by the developer. Uses fuzzy regex efficiently to tolerate a small number of typos.
- Level 3: Uses regex to answer most questions correctly. Offers useful responses to novel questions some of the time. Uses fuzzy regex to tolerate a small number of typos.
- Level 2: Uses regex and/or fuzzy regex to answer some questions correctly.
- Level 1: Correctly answers some questions.
Phase 2: FAQ Bot Plus (4 points)
- Level 4: Uses linguistic pattern matching and other linguistic knowledge to respond appropriately when user intent is unknown. Exhibits a range of responses and echoes back phrases from the utterance in some cases.
- Level 3: Uses linguistic pattern matching or other linguistic knowledge to respond appropriately when user intent is unknown. Exhibits a range of such responses.
- Level 2: Uses linguistic pattern matching or other linguistic knowledge to respond appropriately sometimes when user intent is unknown. Exhibits some range of such responses.
- Level 1: Responds appropriately sometimes when user intent is unknown. Exhibits a limited range of such responses.
Phase 3: Discord (2 points)
- Level 4: Bot can be added to a Discord server and functions as well as the Python shell version.
- Level 3: Bot can be added to a Discord server and functions almost as well as the Python shell version.
- Level 2: Bot can be added to a Discord server and responds to utterances.
- Level 1: Bot can be added to a Discord server.
Code Structure (6 points)
- Level 4: Highly effective and efficient use of regex, fuzzy regex, and spaCy pattern matching. Uses highly modular and well-structured code. Discord and shell versions of the bot are identical other than the interface code.
- Level 3: Effective use of regex, fuzzy regex, and spaCy pattern matching and/or mostly modular code, shared between the two bot versions.
- Level 2: Uses regex, fuzzy regex, and spaCy pattern matching and/or somewhat modular code.
- Level 1: Limited use of regex, fuzzy regex, and spaCy pattern matching and/or limited modular structure.
External Documentation (2 points)
Phase 1, 2, and 3 text files are present and complete. Instructions and test cases are complete enough to coax the best possible behavior from the bot.
Some of phase 1, 2, and 3 text files are present and/or the instructions and test cases are somewhat complete.
Internal Documentation (2 points)
Commenting and naming conventions are consistent with the course standards (based on PEP-8 and PEP-257). All files contain a docstring with a description, author information, and links to original sources. All functions contain a docstring with a description of behavior, parameters, and return values.
Commenting and naming conventions are somewhat consistent with course standards and/or docstrings are missing or incomplete for some files.