## Exercises

1. Write a function called findMister() that, when given any string, will return a character vector of the words that immediately follow the string “Mister”, with exactly one space in between. The function should take a single argument called str, the string to search. A typical example of use is as follows:

```r
text <- "Here are Mister Tom, MisterJerry, Mister Mister, and Mister\tJoe."
findMister(text)
## [1] "Tom"    "Mister"
```
2. Write a function called findMr() that, when given any string, will return a character vector of all words following the string “Mr.” with exactly one space in between. The function should take a single argument called str, the string to search. A typical example of use is as follows:

```r
text <- "Here are Mr. Tom, Mr Jerry, Mr. Mister, and Mr.\tJoe."
findMr(text)
## [1] "Tom"    "Mister"
```
3. For each of the following descriptions, write a regular expression to test whether any of the sub-string(s) described occur in a given string. The regular expression should match any of the sub-strings described, and should not match any other sub-string. Try to make the regular expression as short as possible. Write the regular expression as a string that could be used in one of R’s regex functions (i.e., with extra backslash escapes as needed). The first item is done for you, as an example.

• bot and bat. Regex string: "b[oa]t". (This is the one to submit, because it’s shorter than other alternatives such as "bot|bat").
• cart and cars and carp.
• slick and sick
• Any word ending in ity (such as velocity and ferocity). Be sure to pay attention to word boundaries: the match should be “velocity”, not “ velocity” (with a space before the “v”) or “velocity;” (with a trailing semicolon).
• A whole number consisting of more than six digits.
• A word that is between 3 and 6 characters long. Pay attention to word-boundaries.
• One or more white-space characters, followed by a hyphen or a semicolon or a colon.
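
One way to check a candidate answer is to run it against a few strings that should and should not match. The sketch below does this for the worked example "b[oa]t", and also shows what "extra backslash escapes" means in practice. It assumes base R's grepl(); the sample strings and the "a\\.b" pattern are made up for illustration and are not among the items above.

```r
# Sample strings, invented for this illustration
samples <- c("robot", "bat", "boot", "a.b", "ab")

# The worked example: TRUE wherever "bot" or "bat" occurs in the string
botOrBat <- grepl("b[oa]t", samples)
botOrBat
# TRUE TRUE FALSE FALSE FALSE

# A literal period is written \. in a regex; inside an R string the
# backslash must itself be escaped, so the pattern is typed as "a\\.b"
literalDot <- grepl("a\\.b", samples)
literalDot
# FALSE FALSE FALSE TRUE FALSE
```

Including a near miss such as "boot" in the test strings is a quick way to catch a pattern that matches more than intended.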
4. Write a function called findTitled() that, when given any string, will return a character vector of all words following any one of these titles:

• “Mr.”
• “Mister”
• “Missus”
• “Mrs.”
• “Miss”
• “Ms.”

There should be exactly one space between the title and the following word. The function should take a single argument called str, the string to search. A typical example of use is as follows:

```r
text <- "Here are Mr. Tom, Ms. Thatcher, Miss Ellen, and Helen."
findTitled(text)
## [1] "Tom"      "Thatcher" "Ellen"
```
5. Write a function called capRepeats() that, when given a string, searches for all repeated-word pairs (with at least one character of white-space in between) and replaces them with the same pair where all letters are capitalized. The function should take a single argument called str, the string to be searched. A typical example of use would be as follows:

```r
capRepeats("I have a boo boo on my knee    \tknee!")
## [1] "I have a BOO BOO on my KNEE    \tKNEE!"
```
6. Use str_subset() to write a function called longWord() that, when given a character vector of strings, returns a vector consisting of the strings that contain a word at least eight characters long. The function should take a single argument called strs. An example of use would be:

```r
myText <- c("Very short words.", "Got a gargantuan word.", "More short words!")
longWord(strs = myText)
## [1] "Got a gargantuan word."
```
7. Write a function called longWord2() that, when given a character vector of strings, returns a list of character vectors, where each vector consists of the words in the corresponding string that are at least eight characters long. The function should take a single argument called strs. An example of use would be:

```r
myText <- c("Very short words.", "Got a gargantuan word.", "More short words!")
longWord2(strs = myText)
## [[1]]
## character(0)
##
## [[2]]
## [1] "gargantuan"
##
## [[3]]
## character(0)
```
8. Write a function called phoneNumber() that, when given a vector of strings, returns a logical vector indicating which of the strings contain a valid phone number. For our purposes a valid phone number shall be any string of the form

xxx-xxx-xxxx

or

xxx.xxx.xxxx

Thus, 502-863-8111 is valid and so is 502.863.8111, but 502-863.8111 is not, because it mixes the two separators.

In the code for the function, specify the pattern using (?x) so you can ignore whitespace and leave detailed comments for each portion of the regular expression.
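
As a reminder of what a commented pattern looks like, here is a minimal sketch of (?x) applied to a different task: finding a date written as yyyy-mm-dd. It assumes base R's grepl() with perl = TRUE; the variable names and sample strings are hypothetical and chosen only for illustration.

```r
# (?x) switches on free-spacing mode: whitespace in the pattern is ignored
# and everything from # to the end of a line is treated as a comment
datePattern <- "(?x)   # free-spacing mode on
  \\b                  # start at a word boundary
  \\d{4}               # four-digit year
  -                    # literal hyphen
  \\d{2}               # two-digit month
  -                    # literal hyphen
  \\d{2}               # two-digit day
  \\b                  # end at a word boundary
"

dueDates <- c("Due 2024-03-15.", "Due 2024-3-15.")
hasDate <- grepl(datePattern, dueDates, perl = TRUE)
hasDate
# TRUE FALSE
```

Because whitespace is ignored in this mode, a space that must actually appear in the target text has to be written explicitly, for example as \\s or inside a character class.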

The function should take a single parameter called strs. A typical example of use would be:

```r
sentences <- c("Ted's number is 606-255-3143.",
               "Rhonda's number is 403-28-1259.",
               "Lydia's number is 502.255.3921.",
               "Raj's number is 502.367-4432.")
phoneNumber(strs = sentences)
## [1]  TRUE FALSE  TRUE FALSE
```