“Very quickly, with a few bits of information, everyone is unique,” said Dr. Erlich.
One possible solution is to control access. Those who want to use sensitive data — medical records, for example — would have to access them in a secure room. The data can be used but not copied, and whatever is done with the information must be recorded.
Researchers also can get to the information remotely, but “there are very strict requirements for the room where the access point is installed,” said Kamel Gadouche, chief executive of a research data center in France, C.A.S.D., which relies on these methods.
The center holds information on 66 million individuals, including tax and medical data, provided by governments and universities. “We are not restricting access,” Mr. Gadouche said. “We are controlling access.”
But there is a drawback to restricted access. If a scientist submits a research paper to a journal, for example, others might want to confirm the results by using the data — a challenge if the data were not freely available.
Other ideas include something called “secure multiparty computation.”
“It’s a cryptographic trick,” Dr. Erlich said. “Suppose you want to compute the average salary for both of us. I don’t want to tell you my salary and you don’t want to tell me yours.”
So, he said, the two parties exchange encrypted information, which a computer then unscrambles.
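One common way to build such a computation is additive secret sharing, and the salary example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not necessarily the scheme Dr. Erlich had in mind: the function names, the modulus and the sample salaries are all hypothetical, and a real deployment would also need secure channels between the parties.

```python
import secrets

# Hypothetical modulus; it only needs to exceed the largest possible total.
MODULUS = 2**61 - 1


def split_into_shares(value, n_parties):
    """Split a value into random shares that sum to it modulo MODULUS.

    Any single share is a uniformly random number and reveals nothing
    about the value on its own.
    """
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_average(salaries):
    """Simulate all parties in one process (an assumption for illustration)."""
    n = len(salaries)
    # Each party splits its own salary and hands share j to party j.
    all_shares = [split_into_shares(s, n) for s in salaries]
    # Each party adds the shares it received and publishes only that sum.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # The published partial sums combine into the true total.
    total = sum(partial_sums) % MODULUS
    return total / n


if __name__ == "__main__":
    print(secure_average([70000, 90000]))
```

With only two participants, each can work out the other’s salary from the average and their own figure, so the technique protects individual values only when three or more parties take part.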
“In theory, it works great,” said Dr. Erlich. But for scientific research, the method has limits. If the end result seems wrong, “you cannot debug it, because everything is so secure you can’t see the raw data.”
The records gathered on all of us will never be completely private, he added: “You cannot reduce risk to zero.”