Personal data and a new business model


Instead of being something collected by others and somehow used against you, your digital data becomes a mechanism for getting companies to send you information about things you actually want to buy.

Personal, located in the Washington, DC area, has built a personal data service that encourages users to enter personal information into Personal’s cloud-based vault.  The service allows people to organize their data into ‘gems’, then send this information to family, friends and business associates.  Here are some quick-hit videos that explain the company and the concept.

I have direct experience with personal data vaults and, frankly, the uptake on this type of service is currently poor.  It may well be a generational thing, and perhaps time has to pass before enough people will trust a cloud service with their secrets.

But I think that the real obstacle for existing personal vaults may well be the current ‘user pay’ business model.  People don’t see the value in a paid-for personal data service — but could they use a service that allows them to control and sell their own personal data?

Personal’s model anticipates a future where advertisers will seek out personal data from prospects and pay for the information.  Personal is hoping to capitalize on this by becoming the broker for millions of personal data transactions and taking a percentage of the transaction fees as commission.  We, as rightful owners of the data, get the rest!

Is this the future of personal data? Are we seeing a move away from intrusive data collection for the service operator’s profit alone (the Google and Facebook models) to a world where we own, control and reap the benefits of our own information?


Google’s latest privacy troubles

Update 05/27: Kim Cameron has an excellent post on this issue (and a clarification here) that illustrates the identity impacts of Google’s wifi scanning.

It would appear that Google’s Street View cars were actively collecting data from unprotected home wifi networks over the past several years.  According to the New York Times article:

After being pressed by European officials about the kind of data the company compiled in creating the archive — and what it did with that information — Google acknowledged on Friday that it had collected snippets of private data around the world. In a blog post on its Web site, the company said information had been recorded as it was sent over unencrypted residential wireless networks as Google’s Street View cars with mounted recording equipment passed by.

I’m not sure how to react to this but it sure raises some questions:

  • Why would the Street View cars be scanning for unprotected networks in the first place? The company has said it helps to improve geo-location but given the other tools at its disposal, I suspect they weren’t relying on home network MAC addresses to keep their location data accurate.
  • Why would they then record user data — web sites visited, emails sent, etc. — and subsequently store it on central servers? How can this be classified as a  ‘programming error’? Perhaps that explanation could fool some of the less technical authorities, but let’s get real here — systematic recording of user generated data when only the MAC address is needed IS NOT a programming ‘error’.  It is a ‘function’.
  • Why would this only come to light after four years and why did it take a demand from a German official to inspect the car’s missing hard drive for this to become public at all?
  • Are we getting the full goods from Google, a company known for its privacy transgressions?

Companies like Google (and Facebook, a company with privacy troubles of its own) are successful because of the goodwill and trust extended to them by us.  There are other search engines and cloud services out there we can use.

Breaches like this are bad enough — the pithy excuses and blatant PR spin when caught are even worse.


Passport Canada’s Retreat

Today’s Globe and Mail brought news of Passport Canada’s decision to abruptly cancel its online passport application system.  The online service allowed Canadians to fill out their passport application online, and was launched four years ago as a progressive example of e-government.

The reason given for removing the service was a lack of ‘convenience’ for passport applicants, and passwords were cited as an example of that inconvenience.  The system is being replaced by online forms that don’t require a user account and password to be used.  Presumably a user will now need to fill in the form on a web page, then print the form and bring it into a Passport Canada office for processing.

Of course, Passport Canada has been under attack by the Canadian privacy commissioner and had an embarrassing security breach in late 2007.  The claim that the service was inconvenient due to the need for users to remember passwords is a bit suspicious — by this logic, we’d have wholesale dismantling of online government services and the requisite hiring frenzy to replace them with counter representatives…

A better explanation is that the agency has decided that the risks of making passport data available on the web have exceeded the organization’s tolerance levels. And good for them.  Until they are able to deliver a highly secured system, or reduce the amount of data accessible online, the online passport application should stay offline.


Identity Providers in Government

Most of my consulting projects are delivered in the public sector: higher education, central services, municipal and, a few years ago, the health department. Until recently, my projects have involved implementing systems to deliver identity and access management, usually on a deadline and usually for a specific application or set of applications.

But I have also had the opportunity to work on more conceptual projects including defining an IdM strategy for a government department. Starting in the next few weeks, my team will begin architecting and designing federated and user-centric identity solutions.

The first thing we are working through is a set of use cases that will help drive out solution designs.  We already know what the technologies are capable of and we have selected the products we need to conduct the proofs-of-concept.  But what are we going to do with this technology?

If we break down the emerging identity model into Identity Providers, Service Providers and Users, we can define actions in our use cases by these actors.  This post starts the discussion with what makes a good Identity Provider (IdP).  Specifically, the discussion is around the Citizen to Government context.

Who should act as an IdP? I believe that there are actually a limited number of government organizations that can fulfill this role.  While many government departments (and divisions and branches within those departments) might maintain citizen information, only a few actually maintain citizen registries.  And it is these registries — legislated databases of citizen information that are highly secured and carefully maintained — that are ideally suited to supporting an IdP implementation.

Why? Because citizen registries matter.  These databases are consistently used for identification to support both real-world and electronic transactions. A registry of citizens is used to support eligibility for health services in a province. A registry of drivers allows for issuance of drivers licenses and enforcement of road use laws. Student registries ensure that the right student gets credit for exam results, course marks and certifications.  The tax department keeps a reliable registry of citizen tax payers.

In my home province there are also registries for seniors, land titles, vital statistics, children/youth, and perhaps a few others, but that’s about it.  Federally we have citizen registries for taxation, family benefits, veterans, guns (well… the people that own them), and others.

The point to all this is that there are a finite number of authoritative sources of citizen identity information.  It therefore makes sense to leverage these databases for purposes of building reliable identity provider services.

I would even take it a step further — it makes very little sense to build a citizen IdP that is not built on a government registry. Why? Because the legislative authority to build a registry — and the effort to maintain it over time — are not trivial things. Therefore, government departments that contain registries take the job seriously. Registries are secured, monitored and carefully updated.  They often contain key identity attributes such as legal name, date of birth and residential address. Registries are subject to review by provincial and national privacy commissioners. Some registries contain some unique information as well, such as relationships: parent to student, husband to wife, driver to vehicle.

In the event of problems, bodies that manage registries have processes for citizens to correct information contained in these databases. Most of us care very much if a registry does not have our correct information.  Errors can lead to late payments, loss of hard-earned certifications or denial of critical services.  For example, if the tax department misspells our name, it is difficult to cash our refund cheque and we’ll be certain to correct them at the first opportunity.

To further bring this point home, consider the municipal property tax roll.  The city maintains this database and it is important to them that the rate payer be linked to the correct property.  They want to know who to contact if taxes are in arrears or if a ticket needs to be issued for icy sidewalks.  But municipalities don’t deliver most of the life-sustaining or entitlement services that truly matter to us.  Cities and towns also don’t have a business need to record useful identity information like date of birth or gender.  If my city tax assessment arrived and my name was spelled incorrectly, I would probably ask that it be changed, but there would be limited consequences if I didn’t. For these reasons, the city’s tax roll would make a poor choice as the basis for an IdP.

But a municipal government still needs to deliver services based on citizen entitlements, and identification can play an important role in electronic service delivery.  So if my city’s own databases are poor choices, where should it turn?  To a higher level of government, namely a provincial or federal IdP based on a robust registry.  By establishing an agreement with one or more registry-based IdPs, my city can focus on delivering services (acting as a Service Provider) and leave the more difficult identification and authentication of citizens to an IdP.
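The division of labour described here can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the class names, the registry contents and the `credential_check` callback are all hypothetical, and a real deployment would rely on a federation protocol with signed assertions rather than plain in-memory objects.

```python
from dataclasses import dataclass


@dataclass
class Assertion:
    """What the IdP tells the Service Provider about an authenticated citizen."""
    subject_id: str
    issuer: str
    attributes: dict


class RegistryIdP:
    """Hypothetical identity provider backed by an authoritative registry."""

    def __init__(self, name, registry):
        self.name = name
        self._registry = registry  # maps subject_id -> registry record

    def authenticate(self, subject_id, credential_check):
        # credential_check is an assumed callback that validates the
        # citizen's answers against the registry record.
        record = self._registry.get(subject_id)
        if record is None or not credential_check(record):
            return None
        # Release only the attribute the service needs, not the whole record.
        return Assertion(subject_id, self.name,
                         {"legal_name": record["legal_name"]})


class MunicipalService:
    """Service provider that accepts assertions only from IdPs it has
    an agreement with, and does no identity proofing of its own."""

    def __init__(self, trusted_issuers):
        self._trusted = set(trusted_issuers)

    def request_permit(self, assertion):
        if assertion is None or assertion.issuer not in self._trusted:
            return "denied"
        return f"permit issued to {assertion.attributes['legal_name']}"


# Illustrative data only.
provincial_idp = RegistryIdP(
    "provincial-registry",
    {"c-100": {"legal_name": "Pat Example", "birth_year": 1970}},
)
city = MunicipalService(trusted_issuers=["provincial-registry"])

assertion = provincial_idp.authenticate("c-100", lambda r: r["birth_year"] == 1970)
print(city.request_permit(assertion))
```

The point of the sketch is that the city never touches the registry record itself; it only decides which issuers it trusts.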

Finally, the idea of using registries is aligned with the Pan-Canadian Identity Management and Authentication Framework.  While use of registries is not specifically prescribed, the concepts presented in the Identity Component (identity context, identity lifecycle, identity assurance levels and identity relationships) seem to map well when considering registries as ‘sources of truth’ for identity.

There will be a proliferation of IdP services established over the next decade, so the quality of identity proofing, especially for establishing credentials that are used in higher value transactions, is critical.  Establishing Identity Providers that are based on government registries will be key to the success of future identity management and electronic service delivery initiatives.


PS2009 — Winn Schwartau

Feb 4th, 9:40am
Live blog post…

Winn Schwartau is the President of Interpact Inc. He explains how easy it is to gather information on an individual; medical, financial and legal information are all available using a range of free and paid Internet services.

Key concerns:
– On the Internet today, there are approx. 500,000 databases containing personal information.
– Virtually no regulation exists to protect privacy especially in the US.
– No-one reads usage agreements that outline what a company can do with our data.
– Privacy rules/laws difficult to set because technology changes so rapidly.
– 75 percent of US residents have had data on them lost or stolen.

He makes a number of interesting points:
– Why can’t we treat our personal details as copyrighted information? Why can’t we own our own names?
– The questions are ethical not legal.
– We need to redefine ‘public domain’ to mean ‘for the public good’.
– We should be able to tell companies that they can only use our information for one transaction (unless we order otherwise).
– We must be able to request and receive all information held on us by companies.
– We must have data error repair rights and, if possible, some recourse for abuse.
– Need leadership and global cooperation to bring about change.

Interesting and thought-provoking, more info at


PS2009 — ePETs

Feb 2nd, 1:00pm

I wasn’t too sure where I’d spend the second part of the workshop day, so I wandered into this panel discussion led by Canadian privacy celebrity Ann Cavoukian and MITRE Corporation’s Dr. Stuart Shapiro:

The MITRE Corporation with the Information and Privacy Commissioner’s Office of Ontario

This session is intended to explore the area of ePETs, which are aimed at supporting privacy within large organizations that must appropriately handle and safeguard large amounts of personally identifiable information (PII) throughout the information life cycle. The dominant focus of traditional PET research and development has been tools to enable data subjects to protect their personal privacy, typically by preventing the collection of PII. There is a growing need, though, for tools that can help data stewards responsibly manage the PII in their possession in accordance with Fair Information Practices.

Okay, so let’s start with this: ePETs are electronic Privacy Enhancing Technologies.  Most of the discussion centred on a tool from a researcher named Khaled El Emam from the University of Ottawa.  He has developed a set of tools for anonymizing data for research purposes.  This technology has the potential to greatly increase the type of information that a hospital or government organization can share with the research community by quantitatively measuring the degree to which the data has been anonymized before it is released.
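To make ‘quantitatively measuring’ a little more concrete, here is a minimal sketch of one well-known measure, k-anonymity: the size of the smallest group of records that share the same combination of indirectly identifying values. I am not claiming this is the measure the tools discussed in the session use; the function name, column names and records below are all made up for illustration.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values for the given quasi-identifier columns."""
    if not records:
        raise ValueError("cannot measure an empty data set")
    groups = Counter(
        tuple(record[column] for column in quasi_identifiers)
        for record in records
    )
    return min(groups.values())


# Hypothetical research extract, already generalized into bands and prefixes.
records = [
    {"age_band": "30-39", "postal_prefix": "K1A", "diagnosis": "asthma"},
    {"age_band": "30-39", "postal_prefix": "K1A", "diagnosis": "diabetes"},
    {"age_band": "40-49", "postal_prefix": "K1A", "diagnosis": "asthma"},
    {"age_band": "40-49", "postal_prefix": "K1A", "diagnosis": "flu"},
    {"age_band": "40-49", "postal_prefix": "K1A", "diagnosis": "asthma"},
]

print(k_anonymity(records, ["age_band", "postal_prefix"]))  # prints 2
```

A data steward could compare a number like this against a release threshold before sharing an extract; a higher k means each individual hides in a larger crowd.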

The panel discussion produced a number of interesting quotes:

  • ‘Don’t let the perfect get in the way of the good’ — Joseph Alhadeff, Oracle Corp.
  • ‘… an Identity Resolution Service is needed’ — Charmaine Lowe, Director at the BC Ministry of Labour and Citizens’ Services, responding to a question related to resolving mis-information in government registries.
  • ‘Federated Privacy Impact Assessment tools are coming’ — Ann Cavoukian, Information and Privacy Commissioner for Ontario.
  • ‘… [government needs to] consider marginalized citizens who cannot produce the required identity proofing documentation when registering for programs and identity management systems’ — attendee from the Insurance Corporation of British Columbia explaining how he sees many cases each year where, for various reasons, people simply do not have the birth certificates, passports, citizenship cards, etc. required for registration

Overall, I thought this was a well-balanced discussion that reflected on how the practical needs of researchers can be met without eroding the privacy protections we expect organizations to provide us.


Identity Assurance — Information Classification


2nd in a series

The first component of the Pan-Canadian Assurance Model to review is the Security Classification of Information.

There are a number of ways that organizations can classify their information.  The Pan-Canadian model uses the Canadian Public Sector Security Classification Guideline developed by the National CIO Subcommittee on Information Protection (NCSIP).  The guideline is quite straightforward, and its justification is summarized by the following quote:

…when electronic information is shared with external jurisdictions that are not aware of the value or sensitivity of an information asset, it becomes essential that the classification rating be established so that the information protection requirements can be quickly understood, communicated, and acted upon.

Here are the guideline’s information classifications and my interpretations of each:

  • 1. Unclassified — Typically publicly available information.  A breach, loss or unauthorized modification would not result in injury to individuals or organizations.
  • 2. Low — Basic information about an individual, internal administrative systems or the status of a government process.  Breaches of Low level information could cause significant injury (in the legal sense) to individuals or organizations including financial loss, service level impacts and/or embarrassment.
  • 3. Medium — Medical/health information, an individual’s tax information, trade secrets, identity information that could be used to support fraud, etc.  Breach or loss “could reasonably be expected to cause serious personal or enterprise injury” including significant financial loss, legal action, etc.
  • 4. High — Cabinet documents, oil & gas exploration data, criminal case information, information on a police informant, etc.  If information rated High were breached, stolen or modified without authorization, “extremely serious” injury to individuals or organizations could be expected to occur.

The above provides general guidelines on how information should be classified.  It is fairly consistent with classification models I’ve used in the past, and the examples in the guideline are quite easy to understand.
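For anyone who wants to record classifications in a system, the four levels map naturally onto an ordered enumeration. The sketch below is mine, not part of the guideline: the asset names and their ratings are hypothetical examples, and the rule that a bundle of information takes the highest rating of anything in it is an assumption I am making for illustration.

```python
from enum import IntEnum


class Classification(IntEnum):
    """The four levels from the guideline; a higher value means greater
    potential injury if the information is breached."""
    UNCLASSIFIED = 1
    LOW = 2
    MEDIUM = 3
    HIGH = 4


# Hypothetical catalogue: asset names and ratings are illustrative only.
ASSET_RATINGS = {
    "published office hours": Classification.UNCLASSIFIED,
    "permit application status": Classification.LOW,
    "individual tax record": Classification.MEDIUM,
    "cabinet document": Classification.HIGH,
}


def rating_for_bundle(asset_names):
    """Assumed rule: information shared together is rated at the highest
    classification of any single item in the bundle."""
    return max(ASSET_RATINGS[name] for name in asset_names)


bundle = ["published office hours", "individual tax record"]
print(rating_for_bundle(bundle).name)  # prints MEDIUM
```

Because the levels are ordered, comparisons such as `rating >= Classification.MEDIUM` work directly, which is handy when the rating later drives other decisions.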

In Practice:

When using this type of guidance, it is important to develop your own examples so that the information your organization manages is used for illustrative purposes.  For example, if you manage permits or licenses, be clear as to how that information is classified both during the permit application process and after it is completed.

Educate your business clients on the guideline.  IT staff must NOT make the classification decision, nor should they influence the decision-makers one way or the other. Business owners and ‘information stewards’ need to classify the information.  Use workshop settings to uncover information being managed, and use the examples to define information classifications.

It is important to perform information classification activities outside the activity to select security controls.  While it is very true that the classification will impact the controls you select, the knowledge that costly or difficult-to-implement controls might be needed must not influence the way information is classified.

Document the classifications — this could be done by each business area, or recorded centrally as an appendix to information classification standards or information management documentation.

The information classifications are mapped to the rightmost column in the model.  This column, titled “Potential Impact of Identification/Authentication Error”, will drive the remainder of the identity assurance analysis.

Next: Trust Levels.


Shared secrets for establishing identity

We are all familiar with the use of shared secrets for establishing our identity when we do business online or over the phone.  These secrets are things like account numbers, our mother’s maiden name or a dollar amount from a recent statement.

Shared secrets are very useful because they significantly reduce the chances that an imposter can gain access to our information by guessing the information being requested.  Shared secrets are also used when digital credentials are first established, and this is an area of significant interest in the public sector where potentially millions of users need to be efficiently enrolled into government services.

Further, both quantity and quality matter.  As governments strive to move more services online, the question of ‘who is at the end of the wire’ takes on more and more significance.  When digital credentials are being used to access confidential data, the impact of improperly identifying an individual can be catastrophic for both the public authority and the individual.

  • A single shared secret on its own makes a poor choice for identifying an individual.  In almost all cases, even those where non-confidential or low-value transactions are taking place, multiple shared secrets are needed to ensure appropriate identity assurance is carried out.
  • The quality of the shared secret is also critically important.  Using a secret that is relatively easy to obtain — e.g. a professional certification number that is displayed on a certificate in the individual’s outer office — is of less value in identity assurance than a secret that is known only to the user.

The best identity assurance schemes are therefore those that use multiple strong shared secrets — information that only the user would generally have access to and information that, typically, is not known by others.

This last point is somewhat critical.  Sharing of confidential information in a household is very common: spouses open each other’s mail; report cards and bank account statements are left in plain view; and personal details such as birthdates are commonly known throughout the household.

A well-constructed identity assurance process must therefore also consider the degree to which shared secrets are known among a household, workplace or other group of individuals.

Fortunately, government organizations have a wealth of citizen information in their databases.  These stores of shared secrets allow a government system to select from a range of options when validating user identity.

An effective enrolment solution depends on carefully analyzing the strength and appropriate combination of multiple secrets in order to select the best ones for e-government applications.
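As a rough illustration of that analysis, the sketch below scores each candidate secret by its strength, discounts it by how widely it is likely to be known in a household or workplace, and then picks the smallest set of secrets (never fewer than two) that reaches a required score. Everything here is an assumption made for the example: the secret names, the 0 to 1 scores, the required score and the simple additive model are hypothetical, not taken from any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharedSecret:
    name: str
    strength: float  # 0.0-1.0, how hard the value is to guess or look up (assumed)
    exposure: float  # 0.0-1.0, how widely known in a household or workplace (assumed)

    @property
    def value(self) -> float:
        # A strong secret that everyone in the house knows is worth little.
        return self.strength * (1.0 - self.exposure)


def select_secrets(candidates, required_score, min_count=2):
    """Return the smallest set of at least min_count secrets whose combined
    value reaches required_score, or None if no set is good enough."""
    ranked = sorted(candidates, key=lambda s: s.value, reverse=True)
    for size in range(min_count, len(ranked) + 1):
        chosen = ranked[:size]
        if sum(s.value for s in chosen) >= required_score:
            return chosen
    return None


# Hypothetical candidates with made-up scores.
candidates = [
    SharedSecret("date of birth", strength=0.3, exposure=0.9),
    SharedSecret("tax amount from last assessment", strength=0.9, exposure=0.3),
    SharedSecret("health card number", strength=0.8, exposure=0.5),
    SharedSecret("certification number", strength=0.5, exposure=0.8),
]

chosen = select_secrets(candidates, required_score=1.0)
print([s.name for s in chosen])
```

With these numbers the selection is the tax amount plus the health card number; date of birth scores poorly because almost everyone in the household knows it.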


Privacy and the customer

I attended a rather good security and privacy conference earlier this year in Victoria, BC, and one of the presenters made some very interesting observations on privacy.  David Skillicorn from Queen’s University presented ‘Businesses, Customers and Relationship’.

There are some real insights in this material and if you get a chance to see this fellow’s presentation in person, don’t miss it.  Here are some of the more interesting ideas that Mr. Skillicorn offered up:

1. Privacy is new; it didn’t exist until the last century or so.  In fact, while privacy is now considered a right in most societies, throughout human history it barely even existed as a concept.

2. Consumers get better product and service offerings when a company knows something about them.  Product offerings can be tailored and discounts offered.  What information would you give up to get a $10-off grocery coupon?

3. Consumers consider companies to be ‘friends’ and want to develop long-lasting relationships (really!).  Trust plays a significant role in whether this relationship continues, and privacy breaches can effectively destroy that trust.  A growing number of data breaches is proving this to be true — one estimate I heard at the conference was that a mis-handled data breach could cause a company to lose 20 to 25% of its customers!

He summarized by saying that businesses need to understand that while data is very easy to collect from their customers, the company’s data use and protection policies must be strong to ensure future breaches of trust do not occur.  This, in theory, should drive more responsible data mining and data management practices.

My guess is that it will take some time before quarterly-results-obsessed marketers come to this realization.  As a result, the frequency and severity of data breaches will increase in the next few years…