Protecting Privacy in Libraries as AI Adoption Accelerates

By: Anusha Nasrulai

Like picking what movie to watch, what restaurant to eat at, or where to go on vacation, what we read next is often recommended to us by personalization algorithms. Social media or reading platforms such as Goodreads already process user data to generate recommended content. Recently, the library catalog browsing app, Libby, announced its own book recommendation feature, Inspire Me.

Inspire Me by Libby recommends books to users based on their own prompts or previously saved titles in the app. Originally announced as an optional feature, Inspire Me features prominently at the top of the home screen when users open the Libby app. The feature recommends books available through the catalogs of the libraries with which users have linked accounts. When the feature was first announced, users and libraries showed resistance, voicing concerns about forced AI adoption and diminished patron privacy. OverDrive, the parent company of Libby, states that readers’ personally identifying data and reading activity are not provided to the AI model.

Libraries work with vendor platforms, distributors, and publishers to deliver library services, particularly for e-materials. Despite popular backlash, vendors are expanding development of AI integrations. OverDrive CEO Steve Potash has announced goals to use AI to “match users to content across its platforms,” which also include the streaming platform Kanopy and the K-12 education platform Sora. Other subscription vendor companies, such as OCLC, EBSCO, and Clarivate, have introduced AI features for content recommendation, enhanced search, text summaries, and AI-generated research assistants. Beyond externally marketed AI tools, vendors are incorporating AI into their internal workflows for “building, improving, and refining products.” Libraries must now balance their duty to protect patron data and privacy against the need to provide access to digital resources.

Legal Regulations

The integration of AI by vendor platforms poses new privacy considerations for libraries. AI introduces new risk points at data collection, processing, training, and deployment.

The United States currently has no comprehensive AI or data privacy laws. Instead, states have passed dozens of laws regulating certain AI use cases. As of now, six states have passed cross-sectoral AI governance laws that apply to commercial entities. Vendors are likely subject to state-level AI and data privacy laws that target commercial entities. Libraries can leverage legal regulations to negotiate with vendors for stronger privacy protections. Trends in AI regulations show that states are increasingly passing and updating AI legislation amid legal challenges and an absence of federal regulation.

AI Governance and Contracting

In light of legal uncertainties, contracts and licenses are a key opportunity for imposing guardrails on AI use. These agreements address how vendors and third parties can collect, process, and disclose user data.

More often than not, vendor agreements do not explicitly disclose internal use of AI tools or AI model training. The research and policy organization Library Futures and staff attorney Layla Maurer presented on this issue, flagging that broad language around operational mechanisms and data usage may permit vendors to train and deploy AI models using patron and institutional data. When reviewing vendor contracts for AI usage, libraries should focus on:

  • Vendor’s rights around data use and sharing, including with third parties. Use of patron data for “analytics” or “development and improvement of services” may include AI training.
  • References to third-party applications or tools, processors, or contractors necessary to carry out services under the agreement.
  • Whether there is a defined data retention period and what happens to patron data when the contract ends.
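
As a rough illustration of the review points above, a first pass over contract text could be partly automated with a simple phrase scan. The sketch below is a minimal Python example, not a legal tool: the phrase list, the `flag_ai_language` function name, and the sample clause are all hypothetical, and a scan like this only surfaces language for an attorney to read more closely.

```python
import re

# Hypothetical watch list based on the kinds of broad phrases discussed above.
# A real review would use terms chosen by counsel.
WATCH_PHRASES = [
    "analytics",
    "development and improvement of services",
    "third-party",
    "third party",
    "retention",
]


def flag_ai_language(contract_text: str) -> dict[str, list[str]]:
    """Map each watched phrase to the sentences that contain it."""
    # Naive sentence split: break after ., ?, or ! followed by whitespace.
    sentences = re.split(r"(?<=[.?!])\s+", contract_text.strip())
    hits: dict[str, list[str]] = {}
    for phrase in WATCH_PHRASES:
        matched = [s for s in sentences if phrase.lower() in s.lower()]
        if matched:
            hits[phrase] = matched
    return hits


# Made-up clause, for illustration only.
sample_clause = (
    "Vendor may use Customer Data for analytics. "
    "Vendor may engage third-party processors to deliver the Services. "
    "Fees are due within thirty days."
)

for phrase, sentences in flag_ai_language(sample_clause).items():
    print(f"{phrase!r}: {sentences}")
```

A keyword match says nothing about what a clause actually permits, so the output is best treated as a reading list rather than a finding.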

Libraries can strengthen contract terms by including language requiring compliance with applicable federal and state laws, as well as with industry standards such as ISO and NIST. In addition, libraries may negotiate with vendors to:

  • Define user rights to data, including the right to opt out of nonessential data collection and the right to delete their data.
  • Limit secondary uses of data, including for training internal or external AI tools.
  • Disclose third-party partners and whether data is shared with or sold to third parties.
  • Conduct privacy and security audits.
  • Establish a data retention period and protocol for destroying data at the end of the retention period.
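
To make the last item concrete, the sketch below shows one way a retention rule might be checked in practice. It is a minimal Python example under stated assumptions: the `PatronRecord` type, the `records_due_for_deletion` function, the 180-day period, and the sample records are all hypothetical, and an actual protocol would follow whatever the negotiated contract defines.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class PatronRecord:
    record_id: str        # hypothetical identifier
    last_activity: date   # date of the patron's most recent activity


def records_due_for_deletion(records, retention_days, today):
    """Return records whose last activity is older than the retention period."""
    cutoff = today - timedelta(days=retention_days)
    return [r for r in records if r.last_activity < cutoff]


# Made-up records, for illustration only.
records = [
    PatronRecord("a1", date(2024, 1, 10)),
    PatronRecord("b2", date(2024, 11, 2)),
]

due = records_due_for_deletion(records, retention_days=180, today=date(2025, 1, 1))
print([r.record_id for r in due])  # prints ['a1']
```

Passing `today` in as a parameter, rather than reading the clock inside the function, keeps the check repeatable during an audit.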

As attorney Layla Maurer put it, “Updating contract language to allow flexibility around software development needs while retaining safeguards for what the licensee… wants to protect is not just an expeditious way to reach an agreement with a software vendor, it’s also a strategy that helps ensure the licensee can continue to safely use the software despite future legislative changes provided the vendor updates their software in a manner consistent with the intent of the legislation.”

Future-proofing

Digital lending and services are a popular means of accessing materials from libraries, but they also raise new challenges for protecting patron privacy. As AI becomes embedded in these services, libraries need to adopt AI guardrails in contracting to manage the harms and opportunities related to AI use in libraries, particularly around privacy.

Your Face Says It All: the FTC Sends a Warning and Rite Aid Settles Down

By: Caroline Dolan

If someone were to glance at your face, they wouldn’t necessarily know if you won big in Vegas or if you’re silently battling a gambling addiction. When you stroll down the street, your face can conceal many a secret, even a lucrative side hustle. While facial recognition (“FR”) software is not a new innovation, deep pockets are investing staggering sums in the FR market. Last year, the market was globally valued at $5.98 billion and is projected to grow at a compound annual growth rate of 14.9% into 2030. This rapid and bold deployment of facial recognition technology may therefore make our faces more revealing than ever, transforming them into our most valuable, yet vulnerable, asset.

A Technical Summary for Non-Techies

Facial recognition uses software to assess similarities between faces and provide determinations. Facial characterization further classifies a face based on individual characteristics like gender, facial expression, and age. Through deep learning AI, artificial neural networks mimic how our brains process data. The neural network consists of various layers of algorithms which process and learn from training data, like images or text, and eventually develop the ability to identify features and make comparisons.

However, when the dataset used to train the FR model is unrepresentative of different genders and races, a biased algorithm is created. Training data that is biased toward certain features creates a critical weak spot in a model’s capabilities and can result in “overfitting,” wherein the machine learning model performs well on the training data but poorly on data that differs from the data on which it was trained. For example, a model that is trained on data that is biased towards images of men with Western features will likely struggle to make accurate determinations on images of East Asian females.
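
One way to see this kind of uneven performance is to compare error rates across demographic groups. The sketch below is a minimal Python illustration with made-up evaluation results; the `false_positive_rate_by_group` function and the group labels are hypothetical and do not reflect any real FR system or dataset.

```python
from collections import defaultdict


def false_positive_rate_by_group(results):
    """Compute the false positive rate for each group.

    results: iterable of (group, predicted_match, actual_match) tuples.
    A false positive is a predicted match where there was no actual match.
    """
    false_positives = defaultdict(int)
    actual_negatives = defaultdict(int)
    for group, predicted_match, actual_match in results:
        if not actual_match:
            actual_negatives[group] += 1
            if predicted_match:
                false_positives[group] += 1
    return {
        group: false_positives[group] / actual_negatives[group]
        for group in actual_negatives
    }


# Made-up evaluation results, for illustration only.
results = [
    ("group_a", False, False),
    ("group_a", False, False),
    ("group_a", True, False),
    ("group_a", False, False),
    ("group_b", True, False),
    ("group_b", True, False),
    ("group_b", False, False),
    ("group_b", False, False),
    ("group_b", True, True),
]

rates = false_positive_rate_by_group(results)
print(rates)  # prints {'group_a': 0.25, 'group_b': 0.5}
```

In this toy data the model wrongly flags people in `group_b` twice as often as people in `group_a`, which is the pattern a pre-deployment audit would be looking for.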

Data collection and curation pose their own set of challenges, but selection bias is a constant risk whether training data is collected from a proprietary large language model (“LLM”), which requires customers to purchase a license with restrictions, or from an open-source LLM, which is freely available and provides flexibility. Ensuring that training data represents a variety of demographics requires AI ethics awareness, intentionality, and potentially federal regulation.

The FTC Cracks Down

In December of 2023, Rite Aid settled with the FTC following the agency’s complaint alleging that the company’s deployment of FR software was reckless and lacked reasonable safeguards, resulting in false identifications and foreseeable harm. Between 2012 and 2020, Rite Aid employed an AI FR program to monitor shoppers without their knowledge and flag “persons of interest.” Those whose faces were deemed a match to one in the company’s “watchlist database” were confronted by employees, searched, and often publicly humiliated before being expelled from the store. 

The agency’s complaint under section 5 of the FTC Act asserted that Rite Aid recklessly overlooked the risk that its FR software would misidentify people based on gender, race, or other demographics. The FTC stated that “Rite Aid’s facial recognition technology was more likely to generate false positives in stores located in predominantly Black and Asian neighborhoods than in predominantly white communities, where 80% of Rite Aid stores are located.” This also violated Rite Aid’s 2010 Security Order which required the company to oversee its third-party software providers.  

The recent settlement prohibits Rite Aid from implementing AI FR technology for five years. It also requires the company to destroy all data that the system has collected. The FTC’s stipulated Order imposes various comprehensive safeguards on “facial recognition or analysis systems,” defined as “an Automated Biometric Security or Surveillance System that analyzes . . . images, descriptions, recordings . . . of or related to an individual’s face to generate an Output.” If Rite Aid later seeks to implement an Automated Biometric Security or Surveillance System, the company must adhere to numerous forms of monitoring, public notices, and data deletion requirements based on the “volume and sensitivity” of the data. Given that Rite Aid filed Chapter 11 bankruptcy in October of 2023, the settlement is pending approval by the bankruptcy court while the FTC’s proposed consent Order goes through public notice and comment.

Facing the Future

Going forward, it is expected that the FTC will remain “vigilant in protecting the public from unfair biometric surveillance and unfair data security practices.” Meanwhile, companies may be incentivized to embrace AI ethics as a new component of “Environmental, Social, and Corporate Governance” while legislators wrestle with how to ensure that automated decision-making technologies evolve responsibly and do not perpetuate discrimination and harm.

Remote Test Scans Expose Larger Privacy Failures

By: James Ostrowski

In a major challenge to pandemic remote learning practices, the court in Ogletree v. Cleveland State University ruled that scanning students’ rooms violates the Fourth Amendment’s prohibition against unreasonable searches. While this decision is a definitive rebuke of a widely used practice, the case also reveals systemic flaws in university privacy practices. This blog will build off Ogletree to strike a balance between test integrity and privacy rights. 

Covid Acceleration 

For technology companies, the coronavirus pandemic was an accelerant. Startups rushed out messaging apps, video platforms, and ecommerce sites to thaw a populace frozen by a blizzard of lockdowns. There was perhaps no greater market capture for technology companies than in education. Colleges moved entirely online, deploying previously known but relatively new technologies, such as Zoom, on an unprecedented scale. Legions of students attended class from their kitchen tables and bedrooms. Professors, intent on maintaining their in-person standards in a remote world, relied on proctoring tools, many of which required room scans from students who had little choice but to comply. Now, two years later, hundreds of programs still record students throughout remote tests. 

Remote Test Scans Ruled Unconstitutional 

In February 2021, a student at Cleveland State University, Aaron Ogletree, was sitting for a remote chemistry exam when his proctor told him to scan his bedroom. He was surprised. Ogletree assumed the room scan policy had been abolished, until, two hours before the test, Cleveland State emailed him that he would have to scan his room. Ogletree responded that he had sensitive tax documents exposed and could not remove them. Like many students, Ogletree had to stay home due to health considerations, and he could only take exams in the bedroom of his house. Faced with the false choice of complying with the search or failing the test, he panned his laptop’s webcam around his bedroom for the proctor and all the students present to see. 

Ogletree sued Cleveland State for violating his Fourth Amendment rights. The Fourth Amendment protects “[t]he right of the people to be secure in their persons, houses, papers, and effects against unreasonable searches and seizures.” 

Ohio District Court judge J. Philip Calabrese decided in favor of the student because of the heightened Fourth Amendment protection afforded to the home, the lack of alternatives for Ogletree, and the short notice. Calabrese conceded that this intrusion may have been minor, but cited Boyd v. United States to support the slippery slope argument that “unconstitutional practices get their first footing…by silent approaches and slight deviations.” 

The facts of this case are a symptom of a larger problem. The university failed its students and its professors when it did not consistently apply its online education technology. 

Arbitrary Application and Lack of Policies 

Cleveland State provides professors with an arsenal of services to administer online classes. These tools include a plagiarism detection system that faculty can use to see students’ IP addresses, a proctoring service that records students and uses artificial intelligence to flag suspicious behavior, and, of course, pre-test room scans.

The school leaves it entirely to the discretion of faculty members—many of whom are not experts in student privacy—to choose which tools or combinations of tools to use. Cleveland State’s existing policies offer no guidance on the tradeoffs of using any one method. This is tantamount to JetBlue asking its pilots to fly through a whiteout without radar.

Toward a Unified Policy

What may have been an understandable oversight in the early pandemic whirlwind cannot be considered so now. The tension between privacy and security is well-known. Only by careful balancing of students’ privacy rights and university interest in test integrity will we find a workable solution. Schools across the country should take heed of the Ogletree ruling. University leadership holds the responsibility to balance those interests and impart clear guidance to test administrators. To foster this progression, we offer two recommendations:

  1. Cost-Benefit Guidance: The university should score tools on the privacy interests involved and the expected benefit of their application. This should include guidance on whether a method can be easily circumvented. As individual teachers are not necessarily savvy on the legal implications of certain remote test policies, the university must provide clear analysis and guidance. An example entry may read, “Blackboard provides student location data. Though location tracking is a relatively common practice, students must be made aware of it. This tool can ensure that students are where they say they are, which is not usually relevant for test integrity. If students wished, they could easily evade this using a low-cost VPN.”
  2. Test Policy Clearly Outlined in Syllabi: Professors should provide guidance within their course descriptions on what technologies and methods are used to administer tests, and students could sign an acknowledgment form. For example, a professor would delineate applications they use to administer exams, information about whether the exams are proctored, and recourse for not following a policy. This way, students can make affirmative decisions about their privacy exposure by choosing a course that aligns with their interests rather than be blindsided by heavy-handed policy in the final weeks of a semester. Professors, in turn, will not have to worry about future disagreements because their students knowingly consented to the course’s policies.
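
The scoring idea in the first recommendation could be captured in something as simple as the sketch below. It is a minimal Python illustration, not a validated rubric: the `ToolAssessment` type, the 1-to-5 scales, the rule that halves the benefit of an easily circumvented tool, and every score shown are hypothetical placeholders that a university would replace with its own analysis.

```python
from dataclasses import dataclass


@dataclass
class ToolAssessment:
    name: str
    privacy_intrusion: int     # 1 (low) to 5 (high); hypothetical scale
    integrity_benefit: int     # 1 (low) to 5 (high); hypothetical scale
    easily_circumvented: bool  # e.g., defeated by a low-cost VPN


def net_score(tool: ToolAssessment) -> int:
    """Benefit minus intrusion; benefit is halved if the tool is easily evaded."""
    benefit = tool.integrity_benefit
    if tool.easily_circumvented:
        benefit //= 2  # assumed discount, chosen only for illustration
    return benefit - tool.privacy_intrusion


# Placeholder scores, for illustration only.
tools = [
    ToolAssessment("Location tracking", 2, 1, True),
    ToolAssessment("Pre-test room scan", 5, 3, False),
    ToolAssessment("Recorded proctoring", 4, 4, False),
]

for tool in sorted(tools, key=net_score, reverse=True):
    print(f"{tool.name}: {net_score(tool)}")
```

A single number cannot settle a constitutional question, but writing the inputs down forces the tradeoff to be stated rather than left to each instructor's intuition.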

The university must balance policy considerations around security and privacy rights. A failure to balance these conflicting pursuits can cause student anxiety, unnecessary privacy violations, and poor test integrity.

Plugging-in Your EV? More Like Plugging-in Your Data.

By: Caroline Dolan

As global warming and ecological degradation progress, sustainable technology and infrastructure are being implemented to remediate and prevent aggravation. However, electric vehicles (EVs), which are an effective way to curb carbon emissions and boost green efforts, pose a unique set of privacy risks every time we plug in.

The data transaction: Plugging-in

EVs are dependent on EV chargers, and for the majority who do not have the capacity to charge at home, public chargers are a necessity. Public EV chargers are essentially Internet of Things (IoT) devices that facilitate the transaction of data for kilowatts. Information involving pricing, session date, time, duration, and power patterns is collected and sent to the operator’s network. Furthermore, most chargers are affiliated with a mobile app or use a radio-frequency identification (RFID) card, implicating your phone as another data source sharing payment information, names, emails, IP addresses, and internet history. In order for an app to make the consumer experience more convenient and recommend the nearest charger, location identification is necessary. However, Certified Information Privacy Professionals have reported how this data can be used to pinpoint your location and predict your typical driving route.

Sharing and collecting this information can make life a lot more convenient and does not seem to pose any imminent risks of harm. However, every public charger is connected to a grid and whether it is a closed or open network, there is always a risk of ransomware attacks, ID fraud, and grid damage. The Cybersecurity and Infrastructure Security Agency defines ransomware as “a form of malware designed to encrypt files on a device, rendering any files and the systems that rely on them unusable. Malicious actors then demand ransom in exchange for decryption.” As described by privacy professionals, closed networks relate to a certain set of manufacturers who have discretion and unrestricted authority to use the data and create profiles; open networks tether multiple manufacturers which decreases each manufacturer’s control but gives more stakeholders access increasing your data’s vulnerability. In other words, while there is not an imminent risk of harm, there is a perpetual risk.

An EV economy

As the Wall Street Journal reported, “Modern vehicles are effectively connected computers on wheels. They’re able to collect a wealth of information via built in apps, sensors, and cameras, which can monitor people both inside and near the vehicle.”

Whether the data originates from the user’s personal device connected to the EV or solely through the charging equipment, the data is ripe for hackers, car manufacturers, insurance companies, and emergency service providers. While such data can help urban planners determine the optimal areas for development and economic profit, it can also inform insurance companies on how to set rates based on driving risk and behavior. More importantly, the Wall Street Journal has recognized that if data brokers obtain and sell the data, even with personal information redacted, movements and habits are individualistic and may provide insight into one’s identity.
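
To illustrate why redaction alone may not be enough, the sketch below shows how a routine pattern can stand out in records that contain no name at all. It is a minimal Python example using made-up sessions; the pseudonymous ID, station names, the 20:00 cutoff, and the `likely_overnight_station` function are all hypothetical.

```python
from collections import Counter

# Made-up, "redacted" charging sessions: no names, only a pseudonymous ID,
# a station label, and the hour (0-23) at which the session started.
sessions = [
    ("user_17", "Station A", 22),
    ("user_17", "Station A", 23),
    ("user_17", "Station B", 9),
    ("user_17", "Station A", 21),
    ("user_17", "Station B", 8),
]


def likely_overnight_station(sessions, pseudonym):
    """Return the station most often used for sessions starting at 20:00 or later."""
    overnight = [
        station
        for user, station, hour in sessions
        if user == pseudonym and hour >= 20
    ]
    if not overnight:
        return None
    return Counter(overnight).most_common(1)[0][0]


print(likely_overnight_station(sessions, "user_17"))  # prints Station A
```

A station that someone reliably uses late in the evening is plausibly near where they live, so even this small amount of "anonymous" data starts to narrow down who the driver is.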

Well-intentioned green policy may be getting ahead of itself

President Biden’s goal of boosting U.S. EV production is being achieved through his Made-in-America EV charging network initiative which is supported by the Department of Transportation’s National Electric Vehicle Infrastructure (NEVI) program. NEVI is distributing $5 billion into various EV programs to create a coast-to-coast network of EV chargers and electrify the highway system. However, these good intentions may be putting the cart before the horse since privacy risks of EVs have yet to be adequately and uniformly regulated.

Notably, the Federal Highway Administration (FHWA) has imposed a set of requirements on NEVI fund recipients stated in its “final rule.” The final rule consists of network connectivity requirements that ensure secure payment processing and minimize the amount of personal information that companies may retain. While these efforts seek to safeguard data and promote transparency, the final rule essentially requires merely “appropriate” data protection and gives states the discretion to determine the means. 

California is one state that is addressing the privacy concerns raised by the EV boom. California’s newly approved Electric Vehicle Infrastructure Deployment Plan cites the state’s Senate Bill 327 which requires a manufacturer of a “connected device” to equip the device with reasonable security features based on the nature and function of the device. From a legal perspective, the reference to SB-327 indicates that EV chargers may constitute a “connected device” and therefore warrant reasonable and appropriate security features and protection. 

However, state regulations are not an adequate shield from the broad destruction of a cyberattack. Therefore, some EV charger companies like ChargePoint have adopted internal regulations and earned certifications from the International Organization for Standardization (ISO) based on their comprehensive information security and cyber-risk management. ChargePoint is a predominant U.S. company that supplies EV charging stations across North America as well as Europe and is therefore subject to Europe’s General Data Protection Regulation (GDPR). The GDPR controls the collection, use, and storage of personal data as well as the conduct of non-EU companies that possess the data of EU residents and citizens. While it seems unlikely that the U.S. will implement a federal law akin to the GDPR, California and ChargePoint may prompt other states and companies to implement regulations that supplement FHWA’s final rule.

Will supporting EVs come at the cost of our privacy?

While it is difficult to encourage people to undertake the risks posed by EVs, even for the sake of curbing carbon emissions, the Earth is a finite resource and without it our privacy is moot. Therefore, people should not be discouraged from purchasing an EV or plugging into a public charger. Rather, the government and individuals should be compelled to hold corporations accountable for how data is stored and used so that we may plug in without fear. As the effects of global warming become more apparent, embracing corporate accountability and privacy protection is critical in order to keep up with the EV boom and conserve the Earth.

Talking to Machines – The Legal Implications of ChatGPT

By: Stephanie Ngo

Chat Generative Pre-trained Transformer, known as ChatGPT, was launched on November 30, 2022. The program has since taken the world by storm with its articulate answers and detailed responses to a multitude of questions. A quick Google search of “chat gpt” returns approximately 171 million results. Similarly, in the first five days of launch, more than a million people had signed up to test the chatbot, according to OpenAI’s president, Greg Brockman. But with new technology comes legal issues that require legal solutions. As ChatGPT continues to grow in popularity, it is now more important than ever to discuss how such a smart system could affect the legal field.

What is Artificial Intelligence? 

Artificial intelligence (AI), per John McCarthy, a world-renowned computer scientist at Stanford University, is “the science and engineering of making intelligent machines, especially intelligent computer programs, that can be used to understand human intelligence.” The first successful AI program was written in 1951 to play a game of checkers, but the idea of “robots” taking on human-like characteristics has been traced back even earlier. Recently, it has been predicted that AI, although prominent now, will permeate the daily lives of individuals by 2025 and seep into various business sectors. Today, the buzz around AI stems from the fast-growing influx of emerging technologies, and how AI can be integrated with current technology to innovate products like self-driving cars, electronic medical records, and personal assistants. Many are aware of what “Siri” is, and consumers’ expectation that Siri will soon become all-knowing is what continues to push the field of AI to develop at such fast speeds.

What is ChatGPT? 

ChatGPT is a chatbot that uses a large language model trained by OpenAI. OpenAI is an AI research and deployment company founded in 2015 dedicated to ensuring that artificial intelligence benefits all of humanity. ChatGPT was trained with data from items such as books and other written materials to generate natural and conversational responses, as if a human had written the reply. Chatbots are not a recent invention. In 2019, Salesforce reported that twenty-three percent of service organizations used AI chatbots. In 2021, Salesforce reported the percentage is now closer to thirty-eight percent of organizations, a sixty-seven percent increase since their 2018 report. The effectiveness, however, left many consumers wishing for a faster, smarter way of getting accurate answers.

In comes ChatGPT, which has been hailed as the “best artificial intelligence chatbot ever released to the general public” by technology columnist, Kevin Roose from the New York Times. ChatGPT’s ability to answer extremely convoluted questions, explain scientific concepts, or even debug large amounts of code is indicative of just how far chatbots have advanced since their creation. Prior to ChatGPT, answers from chatbots were taken with a grain of salt because of the inaccurate, roundabout responses that were likely programmed from a template. ChatGPT, while still imperfect and slightly outdated (its knowledge is restricted to information from before 2021), is being used in manners that some argue could impact many different occupations and render certain inventions obsolete.

The Legal Issues with ChatGPT

ChatGPT has widespread applicability, being touted as rivaling Google in its usage. Since the beta launch in November, there have been countless stories from people in various occupations about ChatGPT’s different use cases. Teachers can use ChatGPT to draft quiz questions. Job seekers can use it to draft and revise cover letters and resumes. Doctors have used the chatbot to diagnose a patient, write letters to insurance companies, and even do certain medical examinations.

On the other hand, ChatGPT has its downsides. One of the main arguments against ChatGPT is that the chatbot’s responses are so natural that students may use it to shirk their homework or plagiarize. To combat the issue of academic dishonesty and misinformation, OpenAI has begun work on accompanying software and training a classifier to distinguish between AI-written text and human-written text. OpenAI has noted that the classifier, while not wholly reliable, will become more reliable the longer it is trained.

Another argument that has arisen involves intellectual property issues. Is the material that ChatGPT produces legal to use? In a similar situation, a different artificial intelligence program, Stable Diffusion, was trained to replicate an artist’s style of illustration and create new artwork based upon the user’s prompt. The artist was concerned that the program’s creations would be associated with her name because the training used her artwork.

Because of how new the technology is, the case law addressing this specific issue is limited. In January 2023, Getty Images, a popular stock photo company, commenced legal proceedings against Stability AI, the creators of Stable Diffusion, in the High Court of Justice in London, claiming Stability AI had infringed on intellectual property rights in content owned or represented by Getty Images absent a license and to the detriment of the content creators. A group of artists has also filed a class-action lawsuit against companies with AI art tools, including Stability AI, alleging the violation of rights of millions of artists. Regarding ChatGPT, when asked about any potential legal issues, the chatbot stated that “there should not be any legal issues” as long as the chatbot is used according to the terms and conditions set by the company and with the appropriate permissions and licenses needed, if any.

Last, but certainly not least, ChatGPT is unable to assess whether the chatbot itself complies with the personal data protections of state privacy laws or of the European Union’s General Data Protection Regulation (GDPR), which many regard as the gold standard of privacy regulation. A lack of compliance with the GDPR or other privacy laws could have serious consequences if a user feeds ChatGPT sensitive information. OpenAI’s privacy policy does state that the company may collect any information that a user communicates to the feature, so it is important for anyone using ChatGPT to pause and think about the impact that sharing information with the chatbot will have before proceeding. As ChatGPT improves and advances, the legal implications are likely to only grow in turn.