Online Symposium: Voluntary Collective Licensing of Music

April 8th, 2008 by Ed Felten

Today we’re kicking off an online symposium on voluntary collective licensing of music, over at the Center for InfoTech Policy site.

The symposium is motivated by recent movement in the music industry toward the possibility of licensing large music catalogs to consumers for a fixed monthly fee. For example, Warner Music, one of the major record companies, just hired Jim Griffin to explore such a system, in which Internet Service Providers would pay a per-user fee to record companies in exchange for allowing the ISPs’ customers to access music freely online. The industry had previously opposed collective licenses, making them politically non-viable, but the policy logjam may be about to break, making this a perfect time to discuss the pros and cons of various policy options.

It’s an issue that evokes strong feelings — just look at the comments on David’s recent post.

We have a strong group of panelists:

  • Matt Earp is a graduate student in the i-school at UC Berkeley, studying the design and implementation of voluntary collective licensing systems.
  • Ari Feldman is a Ph.D. candidate in computer science at Princeton, studying computer security and information policy.
  • Ed Felten is a Professor of Computer Science and Public Affairs at Princeton.
  • Jon Healey is an editorial writer at the Los Angeles Times and writes the paper’s Bit Player blog, which focuses on how technology is changing the entertainment industry’s business models.
  • Samantha Murphy is an independent singer/songwriter and Founder of SMtvMusic.com.
  • David Robinson is Associate Director of the Center for InfoTech Policy at Princeton.
  • Fred von Lohmann is a Senior Staff Attorney at the Electronic Frontier Foundation, specializing in intellectual property matters.
  • Harlan Yu is a Ph.D. candidate in computer science at Princeton, working at the intersection of computer science and public policy.

Check it out!

Phorm’s Harms Extend Beyond Privacy

April 7th, 2008 by Harlan Yu

Last week, I wrote about the privacy concerns surrounding Phorm, an online advertising company that has teamed up with British ISPs to track users’ Web behavior from within their networks. New technical details about Phorm’s Webwise system have since emerged, and it’s not just privacy that now seems to be at risk. The report describing these details exposes a system that actively degrades the user experience and alters users’ interactions with content providers. Even more importantly, the Webwise system is a clear violation of the sacred end-to-end principle that guides the core architectural design of the Internet.

Phorm’s system does more than just passively gain “access to customers’ browsing records,” as previously suggested. Instead, Phorm plans to install a network switch at each participating ISP that actively interferes with the user’s browsing session by injecting multiple URL redirections before the user can retrieve the requested content. Sparing you most of the nitty-gritty technical details: the switch intercepts the initial HTTP request to the content server to check whether a Webwise cookie (containing the user’s randomly assigned identifier, or UID) exists in the browser. It then impersonates the requested server to trick the browser into accepting a spoofed cookie (which I will explain later) that contains the same UID. Only then will the switch forward the request and return the actual content to the user. Basically, this amounts to a big technical hack by Phorm to set the cookies that track users as they browse the Web.
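
To make the flow concrete, here is a toy Python sketch of the interception logic as described above. It models the described behavior, not Phorm’s actual code; the cookie name "webwise_uid" and the single-function structure are illustrative assumptions of mine.

    import uuid

    # One ISP-wide identifier per user, as the Webwise system is described
    # as assigning. The cookie name "webwise_uid" is an assumption.
    PHORM_UID = str(uuid.uuid4())

    def intercept(request_host, browser_cookies):
        """Toy model of the in-network switch handling one HTTP request."""
        domain_cookies = browser_cookies.setdefault(request_host, {})
        if "webwise_uid" not in domain_cookies:
            # The switch bounces the browser through several redirections,
            # answering *as* request_host so the browser accepts a cookie
            # scoped to that domain (the spoofed cookie).
            print("redirecting; impersonating", request_host)
            domain_cookies["webwise_uid"] = PHORM_UID
        print("forwarding request to the real", request_host)

    cookies = {}
    intercept("yahoo.com", cookies)  # first visit: redirects, sets spoofed cookie
    intercept("yahoo.com", cookies)  # later visits pass straight through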

In all, a user’s initial request is redirected three times for each domain contacted. That may not sound like much, but this extra layer of indirection degrades the overall browsing experience, imposing an unnecessary delay that will likely be noticeable to users.

The spoofed cookie that Phorm stores in the user’s browser during this process is also highly questionable. Generally speaking, a cookie is specific to a particular domain, and the browser ensures that a cookie can be read and written only by the domain it belongs to. For example, data in a yahoo.com cookie is sent only when you contact a yahoo.com server, and only a yahoo.com server can put data into that cookie.
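
A rough sketch of that domain-matching rule, which browsers enforce (real cookie policy has more wrinkles, such as the Path and Secure attributes, but this captures the guarantee at issue):

    def cookie_visible_to(cookie_domain, request_host):
        """Roughly the browser rule: a cookie is readable and writable only
        by the domain it belongs to, including that domain's subdomains."""
        return (request_host == cookie_domain
                or request_host.endswith("." + cookie_domain))

    print(cookie_visible_to("yahoo.com", "yahoo.com"))       # True
    print(cookie_visible_to("yahoo.com", "mail.yahoo.com"))  # True
    print(cookie_visible_to("yahoo.com", "phorm.com"))       # False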

But since Phorm controls the switch at the ISP, it can bypass this usual guarantee by impersonating the server to add cookies for other domains. To continue the example, the switch (1) intercepts the user’s request, (2) pretends to be a yahoo.com server, and (3) injects a new yahoo.com cookie that contains the Phorm UID. The browser, believing the cookie to actually be from yahoo.com, happily accepts and stores it. This cookie is used later by Phorm to identify the user whenever the user visits any page on yahoo.com.

Cookie spoofing is problematic because it can change the interaction between the user and the content-providing site. Suppose a site’s privacy policy promises the user that it does not use tracking cookies. But because of Phorm’s spoofing, the browser will store a cookie that (to the user) looks exactly like a tracking cookie from the site. Now, the switch typically strips out this tracking cookie before it reaches the site, but if the user moves to a non-Phorm ISP (say at work), the cookie will actually reach the site in violation of its stated privacy policy. The cookie can also cause other problems, such as a cookie collision if the site cookie inadvertently has the same name as the Phorm cookie.

Disruptive activities inside the network often create this sort of unexpected problem for both users and websites, which is why computer scientists are skeptical of ideas that violate the end-to-end principle. For the uninitiated: the principle, in short, states that system functionality should almost always be implemented at the end hosts of the network, with a few justifiable exceptions. For instance, almost all security functionality (such as data encryption and decryption) is handled by the end hosts and only rarely by machines inside the network.

The Webwise system has no business being inside the network: it plays no role in transporting packets from one end of the network to the other. The technical Internet community has worried for years about the slow erosion of the end-to-end principle, particularly by ISPs looking to further monetize their networks. This is the principle upon which the Internet is built, and one the ISPs must uphold. Phorm’s system, nearly in production, is a concrete instance of this erosion, and ISPs should keep Phorm outside the gate.

NJ Election Discrepancies Worse Than Previously Thought, Contradict Sequoia’s Explanation

April 4th, 2008 by Ed Felten

I wrote previously about discrepancies in the vote totals reported by Sequoia AVC Advantage voting machines in New Jersey’s presidential primary election, and the incomplete explanation offered by Sequoia, the voting machine vendor. I published copies of the “summary tapes” printed by nine voting machines in Union County that showed discrepancies; all of them were consistent with Sequoia’s explanation of what went wrong.

This week we obtained six new summary tapes, from machines in Bergen and Gloucester counties. Two of these new tapes contradict Sequoia’s explanation and show more serious discrepancies than we saw before.

Before we dig into the details, let’s review some background. At the end of Election Day, each Sequoia AVC Advantage voting machine prints a “summary tape” (or “results report”) that lists (among other things) the number of votes cast for each candidate on that machine, and the total voter turnout (number of votes cast) in each party. In the Super Tuesday primary, a few dozen machines in New Jersey showed discrepancies in which the number of votes recorded for candidates in one party exceeded the voter turnout in that party. For example, the vote totals section of a tape might show 61 total votes for Republican candidates, while the turnout section of the same tape shows only 60 Republican voters.

Sequoia’s explanation was that in certain circumstances, a voter would be allowed to vote in one party while being recorded in the other party’s turnout. (“It has been observed that the ‘Option Switch’ or Party Turnout Totals section of the Results Report may be misreported whereby turnout associated with the party or option switch choice is misallocated. In every instance, however, the total turnout, or the sum of the turnout allocation, is accurate.”) Sequoia’s memo points to a technical flaw that might cause this kind of misallocation.

The nine summary tapes I had previously were all consistent with Sequoia’s explanation. Though the total votes exceeded the turnout in one party, the votes were less than the turnout in the other party, so that the discrepancy could have been caused by misallocating turnout as Sequoia described. For example, a tape from Hillside showed 61 Republican votes cast by 60 voters, and 361 Democratic votes cast by 362 voters, for a total of 422 votes cast by 422 voters. Based on these nine tapes, Sequoia’s explanation, though incomplete, could have been correct.

But look at one of the new tapes, from Englewood Cliffs, District 4, in Bergen County. Here’s what the relevant part of the tape shows:

The Republican vote totals are Giuliani 1, Paul 1, Romney 6, McCain 14, for a total of 22. The Democratic totals are Obama 33, Edwards 2, Clinton 49, for a total of 84. That comes to 106 total votes across the two parties.

The turnout section (or “Option Switch Totals”) shows 22 Republican voters and 83 Democratic voters, for a total of 105.
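
For readers who want to check the arithmetic, a few lines of Python over the numbers transcribed above:

    rep_votes = {"Giuliani": 1, "Paul": 1, "Romney": 6, "McCain": 14}
    dem_votes = {"Obama": 33, "Edwards": 2, "Clinton": 49}
    turnout = {"REP": 22, "DEM": 83}
    public_counter = 105

    votes = sum(rep_votes.values()) + sum(dem_votes.values())
    voters = sum(turnout.values())
    print(votes, voters, public_counter)  # 106 105 105
    if votes != voters:
        print("discrepancy:", votes, "votes recorded for only", voters, "voters")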

This is not only wrong — 106 votes cast by 105 voters — but it’s also inconsistent with Sequoia’s explanation. Sequoia says that all of the voters show up in the turnout section, but a few might show up in the wrong party’s turnout. (“In every instance, however, the total turnout, or the sum of the turnout allocation, is accurate.”) That’s not what we see here, so Sequoia’s explanation must be incorrect.

And that’s not all. Each machine has a “public counter” that keeps track of how many votes were cast on the machine in the current election. The public counter, which is found on virtually all voting machines, is one of the important safeguards ensuring that votes are not cast improperly. Here’s the top of the same tape, showing the public counter as 105.

The public counter is important enough that the poll workers actually sign a statement at the bottom of the tape attesting to the value of the public counter, and this tape is no exception.

The public counter says 105, even though 106 votes were reported. That’s a big problem.

Another of the new tapes, this one from West Deptford in Gloucester County, shows a similar discrepancy, with 167 total votes, a total turnout of 166, and public counter showing 166.

How many more New Jersey tapes show errors? What’s wrong with Sequoia’s explanation? What really happened? We don’t know the answers to any of these questions.

Isn’t it time for a truly independent investigation?

Bad Phorm on Privacy

April 3rd, 2008 by Harlan Yu

Phorm, an online advertising company, has recently made deals with several British ISPs to gain unprecedented access to every single Web action taken by their customers. The deals will let Phorm track search terms, URLs and other keywords to create online behavior profiles of individual customers, which will then be used to provide better targeted ads. The company claims that “No private or personal information, or anything that can identify you, is ever stored - and that means your privacy is never at risk.” Although Phorm might have honest intentions, their privacy claims are, at best, misleading to customers.

Phorm’s privacy promise is that personally identifiable information is never stored, but they make no promises about how the raw logs of search terms and URLs are used before being deleted. It’s clear from Phorm’s online literature that they use this sensitive data for ad delivery. In one example, they claim advertisers will be able to target ads directly to users who see the keywords “Paris vacation” either as a search or within the text of a visited webpage. Without even getting to the storage question, users will likely perceive Phorm’s access to and use of their behavioral data as a compromise of their personal privacy.

What Phorm does store permanently are two pieces of information about each user: (1) the “advertising categories” that the user is interested in and (2) a randomly generated ID stored in the user’s browser cookie. Each raw online action is sorted into one or more categories, such as “travel” or “luxury cars”, that are defined by advertisers. The privacy worry is that as these categories become more specific, the behavioral profile of each user becomes ever more precise. Phorm seems to impose no limit on the specificity of these defined categories, so for all intents and purposes, the categories will over time become nearly identical to the search terms themselves. Indeed, Phorm markets its “finely tuned” service as analogous to the typical keyword search campaigns that advertisers are already used to. Phorm has a strong incentive to store arbitrarily specific interest categories about each user, to provide optimally targeted ads and thus boost the profits of its advertising business.

The second protection mechanism is a randomly generated ID number, stored in a browser cookie, that Phorm uses to “anonymously” track a user as she browses the web. This ID number is stored alongside the list of interest categories collected for that user. Phorm deserves credit for recognizing this as more privacy-protecting than simply using the customer’s name or IP address as an identifier (something even Google has disappointingly failed to recognize). But past experience suggests these protections are unlikely to be enough. Storing random user IDs mapped to keywords that mirror actual search queries is highly reminiscent of the AOL data fiasco of 2006, in which AOL released “anonymized” search histories containing 20 million keywords. It turned out to be easy to identify specific individuals by name based solely on their search histories.
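
A toy illustration of why that combination is risky. The data below is entirely made up, but it shows how a pseudonymous profile stops being anonymous as its categories approach the specificity of raw search terms:

    profiles = {  # random UID -> stored interest categories (synthetic data)
        "uid-7f3a": {"travel", "luxury cars"},
        "uid-91bc": {"travel", "knitting"},
        "uid-d204": {"travel", "paris vacation june", "mortgage refi 08540"},
    }

    def users_matching(categories):
        return [uid for uid, cats in profiles.items() if categories <= cats]

    print(users_matching({"travel"}))               # three matches: a crowd to hide in
    print(users_matching({"paris vacation june"}))  # one match: the category *is* the query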

At the very least, the company’s employees will be able to access an AOL-like dataset about the ISPs’ customers. Granted, determining whether a particular dataset is personally identifiable is a notoriously difficult problem and a subject of ongoing research. But it’s inaccurate for Phorm to claim that personally identifiable information is not being stored, and to promise users that their privacy is not at risk.

Music Industry Under Fire for Exploring EFF Suggestion

April 2nd, 2008 by David Robinson

Jim Griffin, a music industry consultant who is in the unusual position of being recognized as smart and reasonable by participants across a broad swath of positions in the copyright debate, revealed last week that he’s working to start a new music industry organization that will urge ISPs to bundle a music licensing fee into their monthly service costs, in exchange for which the major labels will agree not to sue (and, presumably, not to threaten suit against) the ISP’s customers for copyright infringement of the music whose rights they own. The goal, Griffin says, is to “monetize the anarchy of the Internet.”

This idea has a long history and has at various times been propounded by some on the “copyleft.” The Electronic Frontier Foundation, for example, issued in April 2004 a report entitled “A Better Way Forward: Voluntary Collective Licensing of Music File Sharing.” This report even suggested the $5 per user per month ($60 per user per year) fee that Griffin apparently has in mind.

According to the OECD, there were roughly 60 million broadband subscriptions in the United States as of the end of 2006. If each of these were to pay $60 a year, the total would be $3.6 billion a year. I know that broadband uptake is increasing, but I remain unsure how Griffin figures that the proposed system “could create a pool as large as $20 billion a year.” Perhaps this imagines global, rather than national, uptake of the plan? If so, it seems to embody some optimistic assumptions about how widely any such agreement could plausibly be extended.
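
The back-of-the-envelope math, for anyone who wants to adjust the assumptions:

    subscribers = 60_000_000             # OECD: US broadband, end of 2006
    annual_fee = 12 * 5                  # $5 per user per month
    print(subscribers * annual_fee)      # 3,600,000,000 -> $3.6 billion per year

    # Subscriber base needed to reach the quoted $20 billion at the same fee:
    print(20_000_000_000 // annual_fee)  # ~333 million, far beyond the US base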

Some prominent blogs have reacted with ire—Michael Arrington at TechCrunch, for example, characterizes the move as an “extortion scheme.” Arrington argues that a licensing system will hinder innovation because the revenues from it will be constant irrespective of the amount or quality of music published by the labels, and will flow to an infrastructure that, once it begins to be subsidized, will have little structural incentive to innovate. He also argues in a later post that since the core of the system is a covenant not to sue, it represents a “protection racket.”

I think this kind of skepticism is poorly justified at this point. If the labels can turn their statutory right to sue for damages after copyright infringement into a voluntary system where they get paid and nobody gets sued, it strikes me as a case of the system working. And the numbers matter: a $20 billion payoff that would triple the industry’s current $10 billion in annual revenue sounds wonderful for the labels, but unless I am missing something it does not seem probable.

There are two core questions for the plan. First, what will it cover? The idea is that it will let the industry stop suing, and thereby end the antagonism between labels and customers. But unless a critical mass of the labels agree to the plan, users whose ISPs are paying in will still face the risk of suit from non-participating copyright holders. In fact, if the plan takes off, individual rights holders may face an incentive to defect, since consumers are equally likely to infringe all popular music regardless of which music happens to be covered by the plan (since they aren’t likely to track which music is covered).

Second, how will the revenue be shared? Filesharing metrics, provided by analysts like BigChampagne, are at best approximate, and they track only downloads that occur via the public, unencrypted Internet: presumably a large share of the relevant copying, but not all of it, especially on university and other private networks. The squabbles will be challenging, and if past is prologue, the labels may not prove an amicable bunch in negotiating with each other.

Finally, it’s important to remember that the labels’ power depends, in the very long run, on their ability to sign the best new talent. If the licensing system proposed by Griffin takes off, it may preserve the status quo for now. But if the industry continues to give artists themselves a raw deal, as it is so often accused of doing, artists will still have the growing power that digital technology gives them to share their music without a label’s help.

An Inconvenient Truth About Privacy

April 1st, 2008 by Ed Felten

One of the lessons we’ve learned from Al Gore is that it’s possible to have too much of a good thing. We all like to tool around in our SUVs, but too much driving leads to global warming. We must all take responsibility for our own carbon emissions.

The same goes for online privacy, except that there the problem is storage rather than carbon emissions. We all want more and bigger hard drives, but what is going to be stored on those drives? Information, probably relating to other people. The equation is simple: more storage equals more privacy invasion.

That’s why I have pledged to maintain a storage-neutral lifestyle. From now on, whenever I buy a new hard drive, I’ll either delete the same amount of old information, or I’ll purchase a storage offset from someone else who has extra data to delete. By bidding up the cost of storage offsets, I’ll help create a market for storage conservation, without the inconvenience of changing my storage-intensive lifestyle.

Government can do its part, too. If the U.S. government adopted a storage-neutral policy, then for every email the NSA recorded, the government would have to delete another email elsewhere — say, at the White House. It’s truly a win-win outcome. And storage conservation technology can help drive the green economy of the twenty-first century.

For private industry, a cap-and-trade system is the best policy. Companies will receive data storage permits, which can be bought and sold freely. When JuicyCampus conserves storage by eliminating its access logs, it can sell the unused storage capacity to ChoicePoint, perhaps for storing information about the same JuicyCampus posters. The free market will allocate the limited storage capacity efficiently, as those who profit by storing less can sell permits to those who profit by storing more.

Debating these policy niceties is all well and good, but the important thing is for all of us to recognize the storage problem and make changes in our own lives. If you and I don’t reduce our storage footprint, who will?

Please join me today in adopting a storage-neutral lifestyle. You can start by not leaving comments on this post.

Comcast and BitTorrent: Why You Can’t Negotiate with a Protocol

March 28th, 2008 by Ed Felten

The big tech policy news yesterday was Comcast’s announcement that it will stop impeding BitTorrent traffic, but instead will respond to network congestion by slowing traffic from the highest-volume users, regardless of what those users are doing. Comcast also announced a deal with BitTorrent, aimed at developing more effective ways of channeling peer-to-peer traffic through networks.

It may seem natural to respond to a network issue involving BitTorrent by making a deal with BitTorrent — and much of the reporting and commentary has taken that line — but there is something odd about the BitTorrent deal, which only becomes clear when we unpack the difference between the BitTorrent protocol and the BitTorrent company. The BitTorrent protocol is a set of technical rules used by desktop software programs to coordinate the peer-to-peer distribution of files. The company BitTorrent Inc. is just one maker of software that uses the protocol — indeed, it’s a relatively minor player in that market. Most people who use the BitTorrent protocol don’t use software from BitTorrent Inc.

What this means is that changes in BitTorrent Inc’s products won’t have much effect on Comcast’s network. What Comcast needs, if it wants to change conditions in its network, is to change the BitTorrent protocol.

The problem is that you can’t negotiate with a protocol, for the same reason that you can’t negotiate with (say) the English language. You can use the language to negotiate with someone, but you can’t have a negotiation where the other party is the language. You can negotiate with the Queen of England, or the English Department at Princeton, or the people who publish the most popular dictionary. But the language itself just isn’t the kind of entity that can make an agreement or have an intention.

This property of protocols — that you can’t get a meeting with them, convince them to change their behavior, or make a deal with them — seems especially challenging to some Washington policymakers. If you live, as they do, in a world driven by meetings and deal-making, a world where problem-solving means convincing someone to change something, then it’s natural to think that every protocol, and every piece of technology, must be owned and managed by some entity.

Engineers sometimes make a similar mistake in thinking about technology markets. We like to think that technologies are designed by engineers, but often it’s more accurate to say that some technology was designed by a market. And where the market is in charge, there is nobody to call when the technology needs to be changed.

Will Comcast and BitTorrent Inc. succeed in improving the BitTorrent protocol? Maybe. But it won’t be enough simply to have a better protocol. They’ll also have to convince the population of BitTorrent users to switch.

UPDATE (April 2): A reader points out that BitTorrent Inc bought uTorrent, one of the popular client programs implementing the BitTorrent protocol. This means that BitTorrent Inc has more leverage to force adoption of new protocol versions than I had thought. Still, I stand by the basic point of the post, that BitTorrent Inc doesn’t have unilateral power to change the protocol.

California review of the ES&S AutoMARK and M100

March 26th, 2008 by Dan Wallach

California’s Secretary of State has been busy. It appears that ES&S (manufacturer of the Ink-a-Vote voting system used in Los Angeles, as well as the iVotronic systems that made news in Sarasota, Florida in 2006) submitted its latest and greatest “Unity 3.0.1.1” system for California certification. ES&S systems were also considered by Ohio’s study last year, which found a variety of security problems.

California already analyzed the Ink-a-Vote. This time, ES&S submitted its AutoMARK ballot-marking device, which has earned some prior fame for being more accessible than other electronic voting solutions, as well as some prior infamy for having gone through various hardware changes without being resubmitted for certification. ES&S also submitted its M100 precinct-based tabulation systems, which would work in conjunction with the AutoMARK devices. (Most voters would vote with pen on a bubble sheet. The AutoMARK presents a fancy computer interface but really does nothing more than mark the bubble sheet on behalf of the voter.) ES&S apparently did not submit its iVotronic systems.

The results? Certification denied.

Let’s start with the letter from the Secretary to the vendor and work our way down.

ES&S failed to submit “California Use Procedures” to address issues that they were notified about back in December as part of their conditional certification of an earlier version of the system. This can only be interpreted as vendor incompetence. Here’s a choice quote:

ES&S submitted what it stated were its revised, completed California Use Procedures on March 4th. Staff spent several days reviewing the document, which is several hundred pages in length. Staff found revisions expressly called for in the testing reports, but found that none of the changes promised two months earlier in Mr. Groh’s letter of January 11, 2008, were included.

The accessibility report is very well done and should be required reading for anybody wanting to understand accessibility issues from a broad perspective. They found:

  • Physical access has some limitations.
  • There are some personal safety hazards.
  • Voters with severe manual dexterity impairments may not be able to independently remove the ballot from the AutoMARK and cast it.
  • The keypad controls present challenges for some voters.
  • It takes more time to vote with the audio interface.
  • The audio ballot navigation can be confusing.
  • Write-in difficulties frustrated some voters.
  • The voting accuracy was limited by write-in failures.
  • Many of the spoken instructions and prompts are inadequate.
  • The system lacks support for good public hygiene.
  • There were some reliability concerns.
  • The vendor’s pollworker training and materials need improvement.

Yet still, they note that “We are not aware of any public device that has more flexibility in accommodating the wide range of physical and dexterity abilities that voters may have. The key, as always, is whether pollworkers and voters will be able to identify and implement the optimal input system without better guidance or expert support. In fact, it may be that the more flexible a system is, the more difficult it is for novices to navigate through the necessary choices for configuring the access options in order to arrive at the best solution.” One of their most striking findings was how long it took test subjects to use the system. Audio-only voters needed an average of almost 18 minutes to use the machine on a simplified ballot (minimum 10 minutes; maximum 35 minutes). Write-in votes were exceptionally difficult. And, again, this is arguably one of the best voting systems available, at least from an accessibility perspective.

Okay, you were all waiting to learn more about the security problems. Let’s go. The “red team” exercise was performed by the Freeman Craft McGregor Group. It’s a bit skimpy and superficial. Nonetheless, they say:

  • You can swap out the PCMCIA memory cards in the precinct-based ballot tabulator (model M100), while in the precinct. This attack would be unlikely to be detected.
  • There’s no cryptography of any kind protecting the data written to the PCMCIA cards. If an attacker can learn the file format (which isn’t very hard to do) and can get physical access to the card in transit or storage, then the attacker can trivially substitute alternative vote records (see the sketch just after this list).
  • The back-end “Election Reporting Manager” has a feature to add or remove votes from the vote totals. This would be visible in the audit logs, if anybody bothered to look at them, but these sorts of logs aren’t typically produced to the public. (Hart InterCivic has a very similar “Adjust Vote Totals” feature with similar vulnerabilities.)
  • The high speed central ballot tabulator (the M650) writes its results to a Zip disk, again with no cryptography or anything else to protect the data in transit.
  • The database in which audit records are kept has a password that can be “cracked” (we’re not told how). Once you’re into the database, you can create new accounts, delete old audit records, and otherwise cause arbitrary mayhem.
  • Generally speaking, a few minutes of physical access is all you need to compromise any of the back-end tools.
  • All of the physical key locks could be picked in “five seconds to one minute.” The wire and paper-sticker tamper-evidence seals could also be easily bypassed.
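
To see why the second bullet is so damning, consider what protects data on a card when there is no cryptography: at best a plain checksum, which anyone who can write to the card can simply recompute. The file format below is invented for illustration, but the point is general:

    import zlib

    # Invented toy "results record"; the real M100 card format differs.
    record = b"precinct=12;GOP=22;DEM=84"
    card = record + b";crc=%08x" % zlib.crc32(record)

    # An attacker who knows the layout edits the totals and recomputes the
    # checksum; the forged card is indistinguishable from a legitimate one.
    tampered = b"precinct=12;GOP=84;DEM=22"
    forged = tampered + b";crc=%08x" % zlib.crc32(tampered)
    print(card)
    print(forged)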

And then there’s the source code analysis, prepared by atsec (who normally make a living doing Common Criteria analyses). Again, the public report is less detailed than it could and should be (and we have no idea how much more is in the private report). Where should we begin?

The developer did not provide detailed build instructions that would explain how the system is constructed from the source code. Among the missing aspects were details about versions of compilers, build environment and preconditions, and ordering requirements.

This was one of our big annoyances when working on California’s original top-to-bottom review last summer. It’s fantastically helpful to be able to compile the program. You need to do that if you want to run various automated tools that might check for bugs. Likewise, there’s no substitute for being able to add debugging print statements and use other debugging techniques when you want to really understand how something works. Vendors should be required to provide not just source code but everything necessary to produce a working build of the software.

The M100 ballot counter is designed to load and dynamically execute binary files that are stored on the PCMCIA card containing the election definition (A.12) in cleartext without effective integrity protection (A.1).

Or, in other words, election officials must never, ever believe the results they get from electronic vote tabulation without doing a suitable random sample of the paper ballots, by hand, to make sure that the paper ballots are consistent with the electronic tallies. (Similarly fundamental vulnerabilities exist with other vendors’ precinct-based optical scanners.)
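
What might such a random sample look like? A minimal sketch, with made-up precinct data; a real audit would also size the sample statistically rather than fixing it at five:

    import random

    machine_tally = {f"precinct-{i}": 200 + i for i in range(40)}  # synthetic
    hand_count = dict(machine_tally)
    hand_count["precinct-7"] -= 3  # the paper disagrees somewhere

    sampled = random.sample(sorted(machine_tally), 5)
    mismatches = [p for p in sampled if hand_count[p] != machine_tally[p]]
    print("audited:", sampled)
    print("mismatches:", mismatches or "none caught in this sample")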

The M100 design documentation contains a specification of the data structure layout for information stored on the PCMCIA card. The reviewer compared the actual structures as defined in the source code to the documentation, and none of the actual structures matched the specification. Each one showed significant differences to or omissions from the specification.

I require the students in my sophomore-level software engineering class to keep their specs in synchrony with their code as their code evolves. If college sophomores can do it, you’d think professional programmers could do it as well.

The user’s guide for the Election Reporting Manager describes how a password is constructed from publicly-available data. This password cannot be changed, and anyone reading the documentation can use this information to deduce the password. This is not an effective authentication mechanism.

While this report doesn’t get into the ES&S iVotronic, the iVotronic version 8 systems had three-character passwords, fixed at the factory. (They apparently fixed this in the version 9 software, which is now already a few years old.) You’d think they would have gone around and fixed this issue elsewhere in their software, since it’s so fundamental.

A.4 “EDM iVotronic Password Scramble Key and Algorithm”: A hardcoded key is used to obfuscate passwords before storing them in a database. The scrambling algorithm is very weak and reversible, allowing an attacker with access to the scrambled password to retrieve the actual password. The iVotronic is supported by the Unity software but is not being used for California elections.

Well, okay, maybe they didn’t fix the iVotronic passwords, then, either. Other passwords throughout the system are similarly hard-coded and/or poorly stored. And, given that, you can trivially tamper with any and all of the audit logs in the system that might otherwise contain records of what damage you might have done.

In the area of cryptography and key management, multiple potential and actual vulnerabilities were identified, including inappropriate use of symmetric cryptography for authenticity checking (A.9) and several different very weak homebrewed ciphers (A.4, A.7, A.8, A.11). In addition, the code and comments indicated that a checksum algorithm that is suitable only for detecting accidental corruption is used inappropriately with the claimed intent of detecting malicious tampering (A.1).

We’ve seen similar ill-conceived mechanisms from other vendors, so it’s unsurprising to see them here. The number one lesson these vendors should take home is thou shalt not implement thine own cryptography, particularly when the operations they need are all pretty standard and could be pulled from places like the OpenSSL library support code. And even then, you have to know what you’re doing. As Aggelos Kiayias once quipped: don’t use cryptography; use a cryptographer.
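
For contrast, here is roughly what using the standard stuff looks like for the tamper-detection problem above: a keyed MAC from a stock library instead of a homebrewed cipher or a checksum. Key management (keeping the key off the card and out of the source code) remains the hard part, and the key below is only a placeholder:

    import hashlib
    import hmac

    KEY = b"per-election secret, provisioned out of band"  # placeholder

    def seal(record: bytes) -> bytes:
        tag = hmac.new(KEY, record, hashlib.sha256).hexdigest().encode()
        return record + b";mac=" + tag

    def verify(sealed: bytes) -> bool:
        record, _, tag = sealed.rpartition(b";mac=")
        expected = hmac.new(KEY, record, hashlib.sha256).hexdigest().encode()
        return hmac.compare_digest(tag, expected)

    card = seal(b"precinct=12;GOP=22;DEM=84")
    print(verify(card))                        # True
    print(verify(card.replace(b"22", b"84")))  # False: tampering detected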

The developers generally assume that input data will be supplied in the correct expected format. Many modules that process input data do not perform data validation such as range checks for input numbers or checking validity of internal cross references in interlinked data, leading to potentially exploitable vulnerabilities when those assumptions turn out to be incorrect.

They’re talking about buffer overflow vulnerabilities. This is one of the core techniques an attacker might use to gain leverage. If an attacker compromises one solitary memory card on its way back to Election Central, then corrupt data on that card might be able to attack the tabulation system, and thus affect the outcome of the entire election. This report doesn’t contain enough information for us to conclude whether ES&S’s Unity systems are vulnerable in this fashion, but these are exactly the kinds of poor development practices that enable viral attacks.
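
Buffer overflows themselves are a C-level problem, but the underlying discipline (treat everything read from a card as hostile: parse it, then range-check it before use) can be sketched in a few lines. The field names and limits here are invented:

    def parse_vote_count(field: bytes, lo: int = 0, hi: int = 5000) -> int:
        """Parse a numeric field from a memory card, refusing anything
        outside the range a single precinct could plausibly produce."""
        value = int(field)  # raises on non-numeric input
        if not lo <= value <= hi:
            raise ValueError(f"count {value} outside [{lo}, {hi}]")
        return value

    print(parse_vote_count(b"22"))   # accepted
    try:
        parse_vote_count(b"999999")  # a corrupted or malicious field
    except ValueError as err:
        print("rejected:", err)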

Finally, a few summary bullets jumped out at me:

  • The system design does not consistently use privilege separation, leading to large amounts of code being potentially security-critical due to having privileges to modify data.
  • Unhelpful or misleading comments in the code.
  • Subjectively, a large amount of source code compared to the functionality implemented.

Okay, let’s get this straight. The code is bloated, the comments are garbage, and the system is broadly not engineered to restrict privileges. Put that all together, and you’re guaranteed a buggy, error-prone, insecure program that must be incredibly painful to maintain and extend. This is the kind of issue that leads smart companies to start over from scratch (while simultaneously supporting the old version until the new one gets up to speed). Is ES&S, or any other voting system vendor, doing a from-scratch implementation in order to get it right? They’ll never get there any other way.

[Sidebar: I live in Texas. Texas’s Secretary of State, like California’s, is responsible for certifying voting equipment for use in the state. If you visit their web page and scroll to the bottom, you’ll see links for each of the vendors. There are three vendors who are presently certified to sell election equipment here: Hart InterCivic, ES&S and Premier (née Diebold). Nothing yet published on the Texas site post-dates the California or Ohio studies, but Texas’s examiners recently considered a new submission from Hart InterCivic. It will be very interesting to see whether they take any of the staggering security flaws in Hart’s system into consideration. If they do, it would be a big chance for Texas to catch up to the rest of the country. Incidentally, I have offered my services to be on Texas’s board of election examiners on several occasions. Thus far, they haven’t responded. The offer remains open.]

The Security Mindset and “Harmless Failures”

March 26th, 2008 by Ed Felten

Bruce Schneier has an interesting new essay about how security people see the world. Here’s a sample:

Uncle Milton Industries has been selling ant farms to children since 1956. Some years ago, I remember opening one up with a friend. There were no actual ants included in the box. Instead, there was a card that you filled in with your address, and the company would mail you some ants. My friend expressed surprise that you could get ants sent to you in the mail.

I replied: “What’s really interesting is that these people will send a tube of live ants to anyone you tell them to.”

Security requires a particular mindset. Security professionals — at least the good ones — see the world differently. They can’t walk into a store without noticing how they might shoplift. They can’t use a computer without wondering about the security vulnerabilities. They can’t vote without trying to figure out how to vote twice. They just can’t help it.

This kind of thinking is not natural for most people. It’s not natural for engineers. Good engineering involves thinking about how things can be made to work; the security mindset involves thinking about how things can be made to fail. It involves thinking like an attacker, an adversary or a criminal. You don’t have to exploit the vulnerabilities you find, but if you don’t see the world that way, you’ll never notice most security problems.

I’ve often speculated about how much of this is innate, and how much is teachable. In general, I think it’s a particular way of looking at the world, and that it’s far easier to teach someone domain expertise — cryptography or software security or safecracking or document forgery — than it is to teach someone a security mindset.

The ant farm story illustrates another aspect of the security mindset. Your first reaction to the story might have been, “So what? What’s so harmful about sending a package of ordinary ants to an unsuspecting person?” Even Bruce Schneier, who has the security mindset in spades, doesn’t point to any terrible consequence of misdirecting the tube of ants. (You might worry about the ants’ welfare, but in that case ant farms are already problematic.) If you have the security mindset, you’ll probably find the possibility of ant misdirection to be irritating; you’ll feel that something should have been done about it; and you’ll probably file it away in your mental attic, in case it becomes relevant later.

This interest in “harmless failures” — cases where an adversary can cause an anomalous but not directly harmful outcome — is another hallmark of the security mindset. Not all “harmless failures” lead to big trouble, but it’s surprising how often a clever adversary can pile up a stack of seemingly harmless failures into a dangerous tower of trouble. Harmless failures are bad hygiene. We try to stamp them out when we can.

To see why, consider the donotreply.com email story that hit the press recently. When companies send out commercial email (e.g., an airline notifying a passenger of a flight delay) and they don’t want the recipient to reply to the email, they often put in a bogus From address like donotreply@donotreply.com. A clever guy registered the domain donotreply.com, thereby receiving all email addressed to donotreply.com. This included “bounce” replies to misaddressed emails, some of which contained copies of the original email, with information such as bank account statements, site information about military bases in Iraq, and so on. Misdirected ants might not be too dangerous, but misdirected email can cause no end of trouble.

The people who put donotreply.com email addresses into their outgoing email must have known that they didn’t control the donotreply.com domain, so they must have thought of any reply messages directed there as harmless failures. Having gotten that far, there are two ways to avoid trouble. The first way is to think carefully about the traffic that might go to donotreply.com, and realize that some of it is actually dangerous. The second way is to think, “This looks like a harmless failure, but we should avoid it anyway. No good can come of this.” The first way protects you if you’re clever; the second way always protects you.

Which illustrates yet another part of the security mindset: Don’t rely too much on your own cleverness, because somebody out there is surely more clever and more motivated than you are.

Sequoia’s Explanation, and Why It’s Not the Whole Story

March 20th, 2008 by Ed Felten

I wrote yesterday about discrepancies in the results reported by Sequoia AVC Advantage voting machines in New Jersey.

Sequoia issued a memo giving their explanation for what might have happened. Here’s the relevant part:

During a primary election, the “option switches” on the operator panel must be used to activate the voting machine. The operator panel has a total of 12 buttons numbered 1 through 12. Each party participating in the primary election is assigned one of the option switch buttons. The poll worker presses a party option switch button based on the voter authorization slip given to the voter after signing the poll book, and then the poll worker presses the green “Activate” button. This action causes that party’s contests to be activated on the ballot face inside the voting booth.

Let’s assume the Democrat party is assigned option switch 6 while the Republican Party is assigned options switch 12. If a Democrat voter arrives, the poll worker presses the “6″ button followed by the green “Activate” button. The Democrat contests are activated and the voter votes the ballot. For a Republican voter, the poll worker presses the “12″ button followed by the green “Activate” button, which then activates the Republican contests and the voter votes the ballot. This is the correct and proper method of machine activation when using option switches.

However, we have found that when a poll worker selects the lower of the two assigned selection codes, followed by pressing an unused selection code and then pressing the green “Activate” button, the higher numbered party on the operator panel has its contests activated instead while the selection code button for the original party stays active on the operator panel.

Using the above example with the Democrat Party as option switch 6 and the Republican Party as option switch 12, the poll worker presses button 6 for Democrat. The red light next to button number 6 lights up and the operator panel display will show DEM. The poll worker then presses any unused option switch. The red light stays lit next to option switch 6 and the display still says DEM. Now the poll worker presses the green “Activate” button. The red light stays lit next to button number 6, but the operator panel display now says REP and the ballot in the voting booth will activate the Republican party contests.

In each and every case where a machine displays the party turnout issue at the close of the polls, this is the situation that would have caused it, and it can be duplicated on any machine. In addition, for this situation to have occurred, the voter that was in the voting booth at the time of the poll workers action would have voted the opposite party ballot instead of telling the poll worker that the incorrect ballot was activated and the machine would not allow them to vote the party they intended. If they had informed the poll worker, they could have made the party selection change and the voter would have then voted the correct ballot style.

Several points are in order.

First, it’s obvious from this description, and from the fact that this happened on so many machines across the state, that even if Sequoia’s explanation is entirely correct, there was some kind of engineering error on Sequoia’s part that caused the machines to misbehave. Sequoia has tried to paint the anomalies as poll worker error, but that’s not plausible in light of Sequoia’s own explanation.

Consider the scenario described above: there is a moment when the red light next to the DEM button is lit, the operator panel displays DEM, then the poll worker presses the Activate button — and the Republican ballot is activated. No competent engineer would design a system to work that way.

No competent engineer would design this system to ever display REP in the operator panel while simultaneously lighting only the DEM light.

No competent engineer would design this system to ever activate the Republican ballot when the poll worker had pressed the DEM button but had not pressed the REP button.

Sequoia’s own explanation makes clear that they made an engineering error that caused the voting machine to behave incorrectly.
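
How might a machine get into such a state? Purely as a guess, here is a toy state machine consistent with Sequoia’s description, in which the red light and the activation logic keep separate state and an unused switch silently bumps the internal selection without updating the light. This is speculation built only from the memo, not Sequoia’s actual firmware:

    ASSIGNED = {6: "DEM", 12: "REP"}  # option switches from Sequoia's example

    class Panel:
        def __init__(self):
            self.lit = None       # which red light is on
            self.selected = None  # what Activate will actually use

        def press(self, switch):
            if switch in ASSIGNED:
                self.lit = switch
                self.selected = switch
            else:
                # Guessed flaw: an unused switch advances the internal
                # selection to the next assigned switch above it, leaving
                # the light (and the DEM/REP display) untouched.
                self.selected = min((s for s in ASSIGNED if s >= switch),
                                    default=self.selected)

        def activate(self):
            return ASSIGNED[self.lit], ASSIGNED[self.selected]

    p = Panel()
    p.press(6)           # poll worker selects DEM: light 6 on, display DEM
    p.press(9)           # stray press of an unused switch: light stays on 6
    print(p.activate())  # ('DEM', 'REP') -- the light says DEM, ballot is REP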

Second, this doesn’t look like fraud, only error. A malicious attacker who had access to a machine would have had much more powerful, and much less detectable, options at his disposal.

Third, Sequoia seems to avoid saying that what they describe is the only possible cause of such errors. Note the careful wording, “In each and every case where a machine displays [an error], this is the situation that would have caused it …” (emphasis added). They don’t say this “did” cause the errors; they say it “would have”. The sentence is either clumsy or artfully worded.

Fourth, Sequoia’s explanation involves a voter seeing the wrong party’s ballot being activated, and not complaining about it. Assuming (as press accounts say) that the problem happened about sixty times in New Jersey, one would expect that many voters noticed and complained. And one would expect that in at least one of those cases, a poll worker would have noticed that the operator panel was displaying REP and DEM at the same time. Yet there don’t seem to be reports of such behavior.

Fifth, Sequoia doesn’t fully characterize the cases where this problem might occur, so election officials don’t know, for example, which past elections might have been affected.

The bottom line is clear. An investigation is needed — an independent investigation, done by someone not chosen by Sequoia, not paid by Sequoia, and not reporting to Sequoia.

