Exercises
Write a function called findMister() that, when given any string, will return a character vector of the words that immediately follow the string “Mister”, with exactly one space in between. The function should take a single argument called str, the string to search. A typical example of use is as follows:
text <- "Here are Mister Tom, MisterJerry, Mister Mister, and Mister\tJoe."
findMister(text)
## [1] "Tom" "Mister"
Write a function called findMr() that, when given any string, will return a character vector of all words following the string “Mr.” with exactly one space in between. The function should take a single argument called str, the string to search. A typical example of use is as follows:
text <- "Here are Mr. Tom, Mr Jerry, Mr. Mister, and Mr.\tJoe."
findMr(text)
## [1] "Tom" "Mister"
For each of the following descriptions, write a regular expression to test whether any of the sub-string(s) described occur in a given string. The regular expression should match any of the sub-strings described, and should not match any other sub-string. Try to make the regular expression as short as possible. Write the regular expression as a string that could be used in one of R’s regex functions (that is, with extra backslash escapes as needed). The first item is done for you, as an example.
- bot and bat. Regex string: "b[oa]t". (This is the one to submit, because it’s shorter than other alternatives such as "bot|bat".)
- cart and cars and carp.
- slick and sick.
- Any word ending in ity (such as velocity and ferocity). Be sure to pay attention to word-boundaries. You should match "velocity" but not " velocity" (which includes a space before the “v”) or "velocity;".
- A whole number consisting of more than six digits.
- A word that is between 3 and 6 characters long. Pay attention to word-boundaries.
- One or more white-space characters, followed by a hyphen or a semicolon or a colon.
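The remark above about extra backslash escapes can be seen in a small sketch. It uses a pattern that is not one of the items in the list, and the names pricePattern, prices and priceHits are made up for illustration. Base R's grepl() is assumed here; other regex functions should treat the string the same way.

```r
# The regex we want the engine to see is:  \$\d+
# (a literal dollar sign followed by one or more digits).
# Inside an R string literal, each backslash must itself be escaped.
pricePattern <- "\\$\\d+"   # hypothetical name, for illustration only

# writeLines() shows the string as the regex engine receives it.
writeLines(pricePattern)

# Try the pattern on two example strings.
prices <- c("It costs $25.", "It costs 25 dollars.")
priceHits <- grepl(pricePattern, prices)
priceHits
```

The string literal contains four backslash characters in the source code, but the stored string has only two, which is what the regex engine needs.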
Write a function called findTitled() that, when given any string, will return a character vector of all words following any one of these titles:
- “Mr.”
- “Mister”
- “Missus”
- “Mrs.”
- “Miss”
- “Ms.”
There should be exactly one space between the title and the following word. The function should take a single argument called str, the string to search. A typical example of use is as follows:
text <- "Here are Mr. Tom, Ms. Thatcher, Miss Ellen, and Helen."
findTitled(text)
## [1] "Tom" "Thatcher" "Ellen"
Write a function called capRepeats() that, when given a string, searches for all repeated-word pairs (with at least one character of white-space in between) and replaces them with the same pair where all letters are capitalized. The function should take a single argument called str, the string to be searched. A typical example of use would be as follows:
capRepeats("I have a boo boo on my knee \tknee!")
## [1] "I have a BOO BOO on my KNEE \tKNEE!"
Use str_subset() to write a function called longWord() that, when given a character vector of strings, returns a vector consisting of the strings that contain a word at least eight characters long. The function should take a single argument called strs. An example of use would be:
myText <- c("Very short words.", "Got a gargantuan word.", "More short words!")
longWord(strs = myText)
## [1] "Got a gargantuan word."
Write a function called longWord2() that, when given a character vector of strings, returns a list of character vectors, where each vector consists of the words in the corresponding string that are at least eight characters long. The function should take a single argument called strs. An example of use would be:
myText <- c("Very short words.", "Got a gargantuan word.", "More short words!")
longWord2(strs = myText)
## [[1]]
## character(0)
##
## [[2]]
## [1] "gargantuan"
##
## [[3]]
## character(0)
Write a function called phoneNumber() that, when given a vector of strings, returns a logical vector indicating which of the strings contain a valid phone number. For our purposes a valid phone number shall be any string of the form xxx-xxx-xxxx or xxx.xxx.xxxx, where each x is a digit. Thus, 502-863-8111 is valid and so is 502.863.8111, but not 502-863.8111.
In the code for the function, specify the pattern using (?x) so you can ignore whitespace and leave detailed comments for each portion of the regular expression. The function should take a single parameter called strs. A typical example of use would be:
sentences <- c("Ted's number is 606-255-3143.",
               "Rhonda's number is 403-28-1259.",
               "Lydia's number is 502.255.3921.",
               "Raj's number is 502.367-4432.")
phoneNumber(strs = sentences)
## [1] TRUE FALSE TRUE FALSE
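As a hint on the (?x) flag mentioned in the exercise above, here is a minimal sketch of a commented pattern for a different task: spotting a date of the form yyyy-mm-dd. The names datePattern, dueDates and dateHits are hypothetical. The sketch assumes base R's grepl() with perl = TRUE; stringr functions are expected to accept the same inline flag, but that is not verified here.

```r
# (?x) switches on free-spacing mode: whitespace inside the pattern is
# ignored, and everything from # to the end of a line is a comment.
datePattern <- "(?x)   # free-spacing mode
  \\b        # word boundary, so the year is not part of a longer number
  \\d{4}     # four-digit year
  -          # literal hyphen
  \\d{2}     # two-digit month
  -          # literal hyphen
  \\d{2}     # two-digit day
  \\b        # word boundary after the day
"

# Hypothetical test strings: the second has a one-digit month.
dueDates <- c("Due 2024-03-15.", "Due 2024-3-15.")
dateHits <- grepl(datePattern, dueDates, perl = TRUE)
dateHits
```

Because whitespace is ignored in this mode, a space that must be matched literally has to be written another way, for example as \\s or inside a character class.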