Sarah Scheffler

About me

I am a grad student in the BUSec group working with Prof. Mayank Varia. I am interested in improving the law's understanding of technology, and technology's understanding of the law and society. I believe that the current lack of mutual understanding between these fields harms society and the individuals caught in the crossfire, and I hope to help by gaining and spreading an understanding of both. I study applied cryptography, including secure messaging, key derivation, MPC, and hash combiners. I also have strong opinions about machine learning: it should not be used for decisions that carry high costs to individuals when the predictions are wrong, and at the very least such uses should employ algorithmic fairness techniques.

I am the current organizer of the Multi Party Computation Reading Group for BUSec and the Hariri Institute, and am an active member of the Cyber Security, Law, and Society Alliance.

I also made the BU CS department poster template for Beamer. Feel free to email me, make an issue, or make a pull request with complaints or suggestions.

Before I went to Boston University, I majored in computer science and mathematics at Harvey Mudd College. When not working on crypto research, I can often be found playing Dungeons & Dragons.

Current Research

Resilient Password-Based Key Derivation Functions

paper / poster

Older password-based key derivation functions (PBKDFs) like PBKDF2 rely on repeated iteration of a single hash function to force the attacker to spend more computation. But thanks to Bitcoin, the cost of specialized hardware for small, repeated functions has dropped dramatically. Newer PBKDFs like scrypt add memory as a resource that attackers must spend in order to compute the function efficiently. We extend this resource-consumption model to a PBKDF that consumes many resources, such as CPU, storage, cache, or chip access, in order to correctly derive the key from the password. Paper is forthcoming.
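The contrast between iteration-only and memory-hard designs can be seen directly in Python's standard library, which exposes both PBKDF2 and scrypt. This is just an illustrative sketch of the two cost models described above, not code from the paper; the salt and parameters are placeholder values.

```python
import hashlib

password = b"correct horse battery staple"
salt = b"example-salt"  # in practice, use a random per-user salt

# PBKDF2: cost is purely sequential hashing, which is cheap on ASICs/GPUs
pbkdf2_key = hashlib.pbkdf2_hmac("sha256", password, salt, iterations=600_000)

# scrypt: cost also includes memory (roughly n * r * 128 bytes per derivation),
# which raises the price of specialized hardware attacks
scrypt_key = hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1)
```

Both calls are deterministic for a fixed password and salt, which is what lets the legitimate user re-derive the same key while the attacker must pay the full resource cost per guess.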

Algorithmic Fairness: Post-Processing Calibrated Classifiers

paper / poster

Question: When your machine learning algorithm is "calibrated" for different protected groups, can this calibrated score be post-processed in a "fair" way? Answer: In general, no. But you can achieve some partial fairness properties (such as equalizing the positive predictive value across groups), or you can defer on some inputs and guarantee good expected fairness properties for the non-deferred outputs. Paper and poster forthcoming.
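The two partial remedies mentioned above (equalizing positive predictive value, and deferring on borderline inputs) can be sketched with a toy post-processor. Everything here is hypothetical and illustrative: the thresholds, the data, and the function names are mine, not the paper's.

```python
def post_process(score, low, high):
    """Map a calibrated score to 1/0, or None to defer on borderline scores."""
    if score >= high:
        return 1
    if score < low:
        return 0
    return None  # defer this input to a human decision-maker

def ppv(decisions):
    """Positive predictive value over the non-deferred positive predictions."""
    positives = [(p, y) for p, y in decisions if p == 1]
    return sum(y for _, y in positives) / len(positives) if positives else None

# Toy (score, true label) pairs for two protected groups
group_a = [(0.9, 1), (0.8, 1), (0.7, 0), (0.4, 0), (0.2, 0)]
group_b = [(0.85, 1), (0.6, 1), (0.55, 0), (0.3, 1), (0.1, 0)]

# Same deferral band for both groups; deferred inputs are excluded from PPV
decisions_a = [(post_process(s, 0.5, 0.75), y) for s, y in group_a]
decisions_b = [(post_process(s, 0.5, 0.75), y) for s, y in group_b]
```

On this toy data, the borderline scores (0.55 to 0.7) are exactly the ones deferred, so the remaining confident predictions achieve equal PPV across the two groups.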

Older Projects

Dismantling the False Distinction between Traditional Programming and Machine Learning in Lethal Autonomous Weapons


Contrary to my expectations at the start of the project, the type of programming used to create lethal autonomous weapons does not inherently affect their ability to comply with International Humanitarian Law. Traditional programming, machine learning, and artificial intelligence are distinct, overlapping techniques in programming autonomous weapons, and the use of one technique over another should not affect the standard used to determine whether a given lethal autonomous weapon complies with the Law of Armed Conflict. Rather, the same (strict) standards should apply to all lethal autonomous weapons, and their outward performance in accordance with the law should be the sole determinant of legality.

The Unintended Consequences of Email Spam Prevention

paper / website / talk / poster

To combat DNS cache poisoning attacks and exploitation of the DNS as an amplifier in DoS attacks, many recursive DNS resolvers are configured as "closed" and refuse to answer queries made by hosts outside their organization. This work presents a technique to induce DNS queries within an organization, using the organization's email service and the Sender Policy Framework (SPF) email spam-checking mechanism. We use this technique to study closed resolvers, verifying that most closed DNS resolvers have deployed common DNS poisoning defense techniques, but showing that SPF is often deployed in a way that allows an external attacker to cause the organization's resolver to issue numerous DNS queries to a victim IP address by sending a single email to any address within the organization's domain.
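The amplification works because several SPF terms ("include", "a", "mx", "ptr", "exists", "redirect") each require the checking resolver to issue DNS queries. A minimal sketch of how one could identify those lookup-triggering terms in a record follows; this is my own illustrative parser, not the paper's tooling, and the example domains are placeholders.

```python
def spf_lookup_terms(record):
    """Return the SPF terms that force the evaluating resolver to issue DNS queries."""
    terms = []
    for term in record.split()[1:]:  # skip the leading "v=spf1"
        t = term.lstrip("+-~?")      # drop the optional qualifier
        if (t.startswith(("include:", "exists:", "redirect=", "a:", "mx:", "ptr:"))
                or t in ("a", "mx", "ptr")):
            terms.append(term)
    return terms

# A sender-controlled domain can pack many lookup-triggering terms into one
# record; a single received email then fans out into many resolver queries.
record = "v=spf1 include:one.example include:two.example mx a:hosts.example -all"
```

Note that RFC 7208 caps a conforming SPF check at 10 such lookup-triggering terms, which bounds, but does not eliminate, the per-email amplification.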

Proactively-secure Accumulo with Cryptographic Enforcement

At the MIT Lincoln Laboratory, as assistant research staff, I worked in the Secure and Resilient Systems and Technology group within the Cybersecurity and Information Sciences division to assist in the implementation, testing, and release of a library that adds confidentiality and integrity guarantees to the Accumulo database, protecting it against a malicious server or sysadmin. Earlier in the project, I also implemented Oblivious RAM (Path ORAM) for Accumulo.

Quantifying Latent Fingerprint Quality


As a capstone project at Harvey Mudd College, I worked with a team of four students for the MITRE Corporation on a project to design, implement, and test a system that uses image processing and machine learning techniques to evaluate the suitability of crime scene fingerprint images for identification by Automated Fingerprint Identification Systems.

Statistical Testing of Cryptographic Entropy Sources


As a summer undergraduate research fellow at the National Institute of Standards and Technology (NIST), I worked with Dr. Allen Roginsky in the Computer Security Division to improve NIST's statistical tests for entropy sources for use in cryptographic random number generators. I also made adjustments to the process for generating large primes used in cryptography.