Sarah Scheffler

About me

I am a Ph.D. student in the BUsec group working with Prof. Mayank Varia.

I am an applied cryptographer studying zero-knowledge proofs, multi-party computation, secure messaging, private set intersection, and hash combiners. My applied cryptography research creates new cryptographic capabilities inspired by the needs of society, law, and policy.

From 2019 to 2020, I was the organizer of the BUsec Seminar for security, cryptography, and privacy. I am also an active member of the Cyber Security, Law, and Society Alliance.


TurboIKOS: Improved Non-interactive Zero Knowledge with Sublinear Memory
Yaron Gvili, Julie Ha, Sarah Scheffler, Mayank Varia, Ziling Yang, Xinyuan Zhang
ACNS 2021

BooLigero: Improved Sublinear Zero Knowledge Proofs for Boolean Circuits
Yaron Gvili, Sarah Scheffler, Mayank Varia
Financial Crypto 2021

Protecting Cryptography against Self-Incrimination
Sarah Scheffler, Mayank Varia
USENIX Security 2021

Arithmetic Expression Construction
Leo Alcock, Sualeh Asif, Jeffrey Bosboom, Josh Brunner, Charlotte Chen, Erik D. Demaine, Rogers Epstein, Adam Hesterberg, Lior Hirschfeld, William Hu, Jayson Lynch, Sarah Scheffler, Lillian Zhang
ISAAC 2020

Case Study: Disclosure of Indirect Device Fingerprinting in Privacy Policies
Julissa Milligan, Sarah Scheffler, Andrew Sellars, Trishita Tiwari, Ari Trachtenberg, Mayank Varia
STAST 2019

PSPACE-completeness of Pulling Blocks to Reach a Goal
Joshua Ani, Sualeh Asif, Erik D. Demaine, Yevhenii Diomidov, Dylan Hendrickson, Jayson Lynch, Sarah Scheffler, Adam Suhl
JCDCG^3 2019 / JIP 2020

From Soft Classifiers to Hard Decisions: How fair can we be?
Ran Canetti, Aloni Cohen, Nishanth Dikkala, Govind Ramnarayan, Sarah Scheffler, Adam Smith
ACM FAT* 2019

The Unintended Consequences of Email Spam Prevention
Sarah Scheffler, Sean Smith, Yossi Gilad, Sharon Goldberg
PAM 2018

Current Research

Blacklisting Encrypted Messages

Is it possible to have a secure messaging system that maintains all the normal end-to-end encryption secure messaging guarantees, including confidentiality against a malicious server, while also allowing the server to keep a large blacklist of messages that, if sent, can be flagged by the receiver as containing malicious content? Yes, it is possible, but can it be done efficiently enough to use in practice? Can we use it to stop the spread of fake news on end-to-end encrypted platforms? Stay tuned!

Improving Sublinear Zero Knowledge Proofs for Boolean Circuits without Assumptions

By combining existing ZK proof techniques with additional "tests" that check the soundness of further relations, we create a new ZK proof system for Boolean circuits or circuits in GF(2^k). It achieves a Ligero-style proof that is smaller than others in the same category: proofs that do not rely on the discrete log assumption and that do not have prohibitive prover or verifier runtime.

Older Projects

Protecting Cryptography from Self-Incrimination


Technical analysis of the current state of affairs as to whether the production of various cryptographic objects can be compelled by a government subpoena under the "foregone conclusion" exception to the U.S. Fifth Amendment right against self-incrimination.

Resilient Password-Based Key Derivation Functions

paper / poster

Older password-based key derivation functions like PBKDF2 rely on repeated iteration of a single hash function to force the attacker to spend more resources. But thanks to Bitcoin, the cost of specialized hardware for computing small, repeated functions has gone down dramatically. Newer PBKDFs like scrypt add memory as a resource that attackers must spend in order to compute efficiently. We extend this resource consumption model to a PBKDF that consumes many resources, such as CPU, storage, cache, or chip access, in order to correctly derive the key from the password. Paper is forthcoming.
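As a rough illustration of the two existing designs mentioned above (not of the new construction from this project), the sketch below derives a key from the same password with PBKDF2 and with scrypt, using Python's standard hashlib. The iteration count and the scrypt parameters n, r, and p are illustrative assumptions, not recommendations.

```python
import hashlib
import os


def derive_iterated(password: bytes, salt: bytes, iterations: int = 100_000) -> bytes:
    """PBKDF2: cost comes only from iterating HMAC-SHA256 many times."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)


def derive_memory_hard(password: bytes, salt: bytes) -> bytes:
    """scrypt: cost also includes memory, roughly 128 * n * r bytes (16 MiB here)."""
    return hashlib.scrypt(
        password,
        salt=salt,
        n=2**14,  # CPU/memory cost parameter (assumed value)
        r=8,      # block size (assumed value)
        p=1,      # parallelization (assumed value)
        maxmem=64 * 1024 * 1024,  # allow headroom above the 16 MiB working set
        dklen=32,
    )


if __name__ == "__main__":
    salt = os.urandom(16)  # a fresh random salt per password
    password = b"correct horse battery staple"
    print("PBKDF2:", derive_iterated(password, salt).hex())
    print("scrypt:", derive_memory_hard(password, salt).hex())
```

An attacker with hardware specialized for fast hashing gains a large advantage against the first function, while the second also forces each guess to use a block of memory.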

Privacy Against Inference-Based Device Fingerprinting


Device fingerprinting methods are employed by websites in order to identify unique devices. The older, "traditional" device fingerprinting methods involve direct requests for information from a client's machine: measuring information in the HTTP request header, sending a cookie, or embedding additional web requests (e.g., for invisible pixels). All this information can be identified and blocked from being sent, in order to preserve user privacy. However, the newer wave of device fingerprinting methods uses indirect approaches, asking the user's device to perform a seemingly irrelevant computation and then gleaning information from the result. These methods are more insidious, more difficult to block, and somewhat sneaky in the sense that even an expert client may not be able to tell that fingerprinting is happening. These methods are only described in general terms in privacy policies. More research is necessary to determine how to address these methods in a way that respects consumer privacy.

Proposing Safeguards for Government Risk-Assessment Systems


This paper analyzes governmentally-regulated risk assessment systems by evaluating them on three axes: we examine the costs of the systems on individuals, the system holders, and society; we analyze the inputs to the system; and we describe the transparency (or lack thereof) within the systems. Using three case studies (the Unified Passenger system (UPAX), the COMPAS Risk & Need Assessment System, and the FICO score), we develop a standardized set of potential technical requirements to mitigate abuse and ensure individuals are treated fairly while remaining within the constraints levied by the system's purpose.

Dismantling the False Distinction between Traditional Programming and Machine Learning in Lethal Autonomous Weapons

paper / poster

Contrary to my expectations at the start of the project, the type of programming used to create lethal autonomous weapons does not inherently affect their ability to comply with International Humanitarian Law. Traditional programming, machine learning, and artificial intelligence are distinct, overlapping techniques in programming autonomous weapons, and the use of one technique over another should not affect the standard used to determine whether a given lethal autonomous weapon complies with the Law of Armed Conflict. Rather, the same (strict) standards should apply to all lethal autonomous weapons, and their outward performance in accordance with the law should be the sole determinant of legality.

From Soft Classifiers to Hard Decisions: How Fair Can We Be?

paper / poster

Question: When your machine learning algorithm is "calibrated" for different protected groups, can this calibrated score be post-processed in a "fair" way? Answer: In general, no. But you can achieve some partial fairness properties (such as equalizing the positive predictive value across groups), or you can defer on some inputs and guarantee good expected fairness properties for the non-deferred outputs. Paper and poster forthcoming.
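To make "equalizing the positive predictive value across groups" concrete, here is a minimal sketch that turns soft scores into hard decisions with a per-group threshold and then computes each group's PPV. The scores, outcomes, group names, and thresholds are made-up toy values, and simple thresholding is an assumption chosen for illustration; it is not the post-processing method from the paper.

```python
def positive_predictive_value(decisions, outcomes):
    """Fraction of positive decisions whose true outcome is positive."""
    hits = [y for d, y in zip(decisions, outcomes) if d == 1]
    if not hits:
        return None  # PPV is undefined when nothing is predicted positive
    return sum(hits) / len(hits)


def ppv_by_group(scores, outcomes, thresholds):
    """Threshold each group's scores and report the resulting PPV per group."""
    result = {}
    for group, group_scores in scores.items():
        decisions = [1 if s >= thresholds[group] else 0 for s in group_scores]
        result[group] = positive_predictive_value(decisions, outcomes[group])
    return result


# Toy data (hypothetical): soft scores and true outcomes for two groups.
SCORES = {"A": [0.9, 0.8, 0.4, 0.2], "B": [0.9, 0.6, 0.6, 0.3]}
OUTCOMES = {"A": [1, 1, 0, 0], "B": [1, 0, 1, 0]}

if __name__ == "__main__":
    # One shared threshold leaves the groups with different PPVs.
    print(ppv_by_group(SCORES, OUTCOMES, {"A": 0.5, "B": 0.5}))
    # Group-specific thresholds can bring the PPVs together on this toy data.
    print(ppv_by_group(SCORES, OUTCOMES, {"A": 0.5, "B": 0.7}))
```

On this toy data, a shared threshold of 0.5 gives group A a PPV of 1 and group B a PPV of 2/3, while raising group B's threshold to 0.7 gives both groups a PPV of 1, at the price of fewer positive decisions for group B.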

The Unintended Consequences of Email Spam Prevention

paper / website / talk / poster

To combat DNS cache poisoning attacks and exploitation of the DNS as an amplifier in DoS attacks, many recursive DNS resolvers are configured as "closed" and refuse to answer queries made by hosts outside their organization. This work presents a technique to induce DNS queries within an organization, using the organization's email service and the Sender Policy Framework (SPF) email spam-checking mechanism. We use this technique to study closed resolvers, verifying that most closed DNS resolvers have deployed common DNS poisoning defense techniques, but showing that SPF is often deployed in a way that allows an external attacker to cause the organization's resolver to issue numerous DNS queries to a victim IP address by sending a single email to any address within the organization's domain.
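To see why checking a single email can trigger several DNS queries, the sketch below counts the terms in one SPF record that require a DNS lookup to evaluate. Under the SPF specification (RFC 7208), these are the include, a, mx, ptr, and exists mechanisms and the redirect modifier. The function name and the sample record are hypothetical, and this is only an illustration of the mechanism, not the measurement technique from the paper.

```python
# SPF mechanisms whose evaluation requires at least one DNS query.
LOOKUP_MECHANISMS = {"include", "a", "mx", "ptr", "exists"}


def count_dns_lookup_terms(record: str) -> int:
    """Count top-level terms in an SPF record that trigger DNS lookups."""
    count = 0
    for term in record.split()[1:]:  # skip the leading "v=spf1" version tag
        term = term.lstrip("+-~?").lower()  # drop the optional qualifier
        if term.startswith("redirect="):
            count += 1
            continue
        # Strip any ":domain" argument and "/cidr" suffix to get the name.
        name = term.split(":", 1)[0].split("/", 1)[0]
        if name in LOOKUP_MECHANISMS:
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical record using reserved example domains and addresses.
    record = (
        "v=spf1 a mx include:_spf.example.com "
        "ip4:192.0.2.0/24 exists:%{i}.bl.example.net -all"
    )
    print(count_dns_lookup_terms(record))
```

This counts only the terms in one record; each include or redirect points at another record that the receiving resolver must fetch and evaluate in turn, so the real number of queries can be larger.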

Proactively-secure Accumulo with Cryptographic Enforcement


At the MIT Lincoln Laboratory, as assistant research staff, I worked in the Secure and Resilient Systems and Technology group within the Cybersecurity and Information Sciences division to assist in the implementation, testing, and release of a library that adds confidentiality and integrity guarantees to the Accumulo database, protecting it against a malicious server or sysadmin. Earlier in the project, I also implemented Oblivious RAM (Path ORAM) for Accumulo.

Quantifying Latent Fingerprint Quality


As a capstone project at Harvey Mudd College, I worked with a team of four students for the MITRE Corporation on a project to design, implement, and test a system that uses image processing and machine learning techniques to evaluate the suitability of crime scene fingerprint images for identification by Automated Fingerprint Identification Systems.

Statistical Testing of Cryptographic Entropy Sources


As a summer undergraduate research fellow at the National Institute of Standards and Technology (NIST), I worked with Dr. Allen Roginsky in the Computer Security Division to improve NIST's statistical tests for entropy sources for use in cryptographic random number generators. I also made adjustments to the process for generating large primes used in cryptography.