count-words-example
The count-words-example command described in the preceding section has two
bugs, or rather, one bug with two manifestations. First, if you mark a region
containing only whitespace in the middle of some text, the
count-words-example command tells you that the region contains one word!
Second, if you mark a region containing only whitespace at the end of the
buffer or the accessible portion of a narrowed buffer, the command displays
an error message that looks like this:
Search failed: "\\w+\\W*"
If you are reading this in Info in GNU Emacs, you can test for these bugs yourself.
First, evaluate the function in the usual manner to install it.
If you wish, you can also install this keybinding by evaluating it:
(global-set-key "\C-c=" 'count-words-example)
To conduct the first test, set mark and point to the beginning and end of the following line and then type C-c = (or M-x count-words-example if you have not bound C-c =):
one two three
Emacs will tell you, correctly, that the region has three words.
Repeat the test, but place mark at the beginning of the line and place point just before the word ‘one’. Again type the command C-c = (or M-x count-words-example). Emacs should tell you that the region has no words, since it is composed only of the whitespace at the beginning of the line. But instead Emacs tells you that the region has one word!
For the third test, copy the sample line to the end of the *scratch* buffer and then type several spaces at the end of the line. Place mark right after the word ‘three’ and point at the end of line. (The end of the line will be the end of the buffer.) Type C-c = (or M-x count-words-example) as you did before. Again, Emacs should tell you that the region has no words, since it is composed only of the whitespace at the end of the line. Instead, Emacs displays an error message saying ‘Search failed’.
The two bugs stem from the same problem.
Consider the first manifestation of the bug, in which the command tells you
that the whitespace at the beginning of the line contains one word. What
happens is this: the M-x count-words-example command moves point to the
beginning of the region. The while loop tests whether the value of point is
smaller than the value of end, which it is. Consequently, the regular
expression search looks for and finds the first word. It leaves point after
the word, and count is set to one. The while loop repeats, but this time the
value of point is larger than the value of end, so the loop exits and the
function displays a message saying that the number of words in the region is
one. In brief, the regular expression search looks for and finds the word
even though it is outside the marked region.
In the second manifestation of the bug, the region is whitespace at the end
of the buffer. Emacs says ‘Search failed’. What happens is that the
true-or-false-test in the while loop tests true, so the search expression is
executed. But since there are no more words in the buffer, the search fails.
In both manifestations of the bug, the search extends or attempts to extend outside of the region.
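Both manifestations can be reproduced outside of Info with a small sketch. The function my-unbounded-count is a hypothetical, non-interactive stand-in for the loop described above: it is assumed to mirror the unbounded search, and it returns the count rather than displaying a message so that the result is easy to inspect.

```elisp
;; Sketch only: `my-unbounded-count' is a hypothetical stand-in for the
;; buggy loop; it returns the count instead of displaying a message.
(defun my-unbounded-count (beginning end)
  "Count words from BEGINNING to END with an unbounded search (buggy)."
  (save-excursion
    (goto-char beginning)
    (let ((count 0))
      (while (< (point) end)
        (re-search-forward "\\w+\\W*")  ; no bound: may leave the region
        (setq count (1+ count)))
      count)))

;; First manifestation: the region is only the four leading spaces,
;; yet the search runs past END and finds the word "one".
(with-temp-buffer
  (insert "    one two three")
  (my-unbounded-count 1 5))
;; => 1

;; Second manifestation: the region is trailing whitespace at the end
;; of the buffer, so the search signals a `search-failed' error, which
;; is caught here only for illustration.
(with-temp-buffer
  (insert "one two three    ")
  (condition-case err
      (my-unbounded-count 14 (point-max))
    (search-failed (car err))))
;; => search-failed
```

Evaluating each with-temp-buffer form with C-x C-e shows the wrong count in the first case and the error symbol in the second.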
The solution is to limit the search to the region. This is a fairly simple action, but as you may have come to expect, it is not quite as simple as you might think.
As we have seen, the re-search-forward function takes a search pattern as its
first argument. But in addition to this first, mandatory argument, it accepts
three optional arguments. The optional second argument bounds the search. The
optional third argument, if t, causes the function to return nil rather than
signal an error if the search fails. The optional fourth argument is a repeat
count. (In Emacs, you can see a function’s documentation by typing C-h f, the
name of the function, and then RET.)
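To see what the optional arguments do, here is a minimal sketch that you can evaluate in a scratch buffer. The sample strings and the bound of 5 are chosen only for illustration; they are not part of the count-words-example definition.

```elisp
;; Fourth argument: a repeat count of 2 searches twice, so point ends
;; up after "two ", at position 9.  The second and third arguments are
;; left as nil here (no bound, errors signaled as usual).
(with-temp-buffer
  (insert "one two three")
  (goto-char (point-min))
  (re-search-forward "\\w+\\W*" nil nil 2))
;; => 9

;; Second and third arguments: a bound of 5 confines the search to the
;; four leading spaces, so no word is found, and the t makes the
;; function return nil instead of signaling an error.
(with-temp-buffer
  (insert "    one")
  (goto-char (point-min))
  (re-search-forward "\\w+\\W*" 5 t))
;; => nil
```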
In the count-words-example definition, the value of the end of the region is
held by the variable end, which is passed as an argument to the function.
Thus, we can add end as an argument to the regular expression search
expression:
(re-search-forward "\\w+\\W*" end)
However, if you make only this change to the count-words-example definition
and then test the new version of the definition on a stretch of whitespace,
you will receive an error message saying ‘Search failed’.
What happens is this: the search is limited to the region, and fails as you expect because there are no word-constituent characters in the region. Since it fails, we receive an error message. But we do not want to receive an error message in this case; we want to receive the message “The region does NOT have any words.”
The solution to this problem is to provide re-search-forward with a third
argument of t, which causes the function to return nil rather than signal an
error if the search fails.
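The difference between the two behaviors can be seen side by side in the following sketch. It assumes a temporary buffer holding a word followed by four spaces, with point placed just after the word; the symbol signaled is only a marker made up for this illustration.

```elisp
(with-temp-buffer
  (insert "one    ")
  (goto-char 4)                         ; just after the word "one"
  (list
   ;; Without the third argument, the failed search signals an error,
   ;; which is caught here so that the form can finish.
   (condition-case nil
       (re-search-forward "\\w+\\W*" (point-max))
     (search-failed 'signaled))
   ;; With t as the third argument, the failed search returns nil.
   (re-search-forward "\\w+\\W*" (point-max) t)
   ;; In both cases point has not moved.
   (point)))
;; => (signaled nil 4)
```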
However, if you make this change and try it, you will see the message
“Counting words in region ... ” and … you will keep on seeing that
message …, until you type C-g (keyboard-quit).
Here is what happens: the search is limited to the region, as before, and it
fails because there are no word-constituent characters in the region, as
expected. Consequently, the re-search-forward expression returns nil. It does
nothing else. In particular, it does not move point, which it does as a side
effect if it finds the search target. After the re-search-forward expression
returns nil, the next expression in the while loop is evaluated. This
expression increments the count. Then the loop repeats. The
true-or-false-test tests true because the value of point is still less than
the value of end, since the re-search-forward expression did not move point.
… and the cycle repeats …
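You can watch this cycle safely with a sketch that stops itself. The variable passes and its limit of 5 are assumptions added only so that the loop ends without C-g; they are not part of the count-words-example definition.

```elisp
(with-temp-buffer
  (insert "one    ")
  (goto-char 4)                         ; start of the trailing whitespace
  (let ((end (point-max))
        (count 0)
        (passes 0))                     ; hypothetical safety cap
    (while (and (< (point) end)
                (< passes 5))
      (re-search-forward "\\w+\\W*" end t) ; returns nil; point stays put
      (setq count (1+ count))              ; incremented regardless
      (setq passes (1+ passes)))
    (list count (point))))
;; => (5 4)
```

The count reaches 5 although the region holds no words, and point is still at position 4. Without the cap, the loop would never end.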
The count-words-example definition requires yet another modification, to
cause the true-or-false-test of the while loop to test false if the search
fails. Put another way, there are two conditions that must be satisfied in
the true-or-false-test before the word count variable is incremented: point
must still be within the region and the search expression must have found a
word to count.
Since both the first condition and the second condition must be true
together, the two expressions, the region test and the search expression, can
be joined with an and special form and embedded in the while loop as the
true-or-false-test, like this:
(and (< (point) end) (re-search-forward "\\w+\\W*" end t))
The re-search-forward expression returns a non-nil value (the new position of
point) if the search succeeds and as a side effect moves point. Consequently,
as words are found, point is moved through the region. When the search
expression fails to find another word, or when point reaches the end of the
region, the true-or-false-test tests false, the while loop exits, and the
count-words-example function displays one or another of its messages.
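To check the combined test on its own, the following sketch wraps the loop in a hypothetical helper, my-count-words-between, which is assumed here to return the count as a number rather than display a message. The sample text has whitespace at both ends, so one buffer covers all three cases tested earlier.

```elisp
;; Sketch only: `my-count-words-between' is a hypothetical helper that
;; returns the count instead of displaying a message.
(defun my-count-words-between (beginning end)
  "Return the number of words between BEGINNING and END."
  (save-excursion
    (goto-char beginning)
    (let ((count 0))
      (while (and (< (point) end)
                  (re-search-forward "\\w+\\W*" end t))
        (setq count (1+ count)))
      count)))

(with-temp-buffer
  (insert "    one two three    ")
  (list (my-count-words-between 1 5)              ; leading whitespace
        (my-count-words-between 5 18)             ; the three words
        (my-count-words-between 18 (point-max)))) ; trailing whitespace
;; => (0 3 0)
```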
After incorporating these final changes, the count-words-example function
works without bugs (or at least, without bugs that I have found!). Here is
what it looks like:
;;; Final version: while
(defun count-words-example (beginning end)
  "Print number of words in the region."
  (interactive "r")
  (message "Counting words in region ... ")

;;; 1. Set up appropriate conditions.
  (save-excursion
    (let ((count 0))
      (goto-char beginning)

;;; 2. Run the while loop.
      (while (and (< (point) end)
                  (re-search-forward "\\w+\\W*" end t))
        (setq count (1+ count)))

;;; 3. Send a message to the user.
      (cond ((zerop count)
             (message
              "The region does NOT have any words."))
            ((= 1 count)
             (message
              "The region has 1 word."))
            (t
             (message
              "The region has %d words." count))))))