CS499/579 :: Empirical Computer Security
- [Active] November 30: Project presentations in 144 Batcheller Hall from 9:30-11AM on Monday, December 11. Refreshments will be served.
The best way to learn about empirical security research is to do it. You will work on independent research projects in groups of 2 or 3. Part of the project entails 1) a research paper (~9 pages for 3-person team, ~6 pages for 2-person team) and 2) a short presentation (8-10 min) at the end of the quarter.
Unless otherwise specified, all deadlines are at 9PM Pacific Time, and all submissions should be uploaded to Canvas. Only one submission is required per project group. You MUST indicate project member names and ONIDs in the submitted file.
Your proposal should address the following questions:
- What are you trying to do? Did you explain the problem?
- Why is this important?
- What are the security applications of the project?
- How is the problem you are addressing handled today? What are the limits of current approaches?
- What is new about your proposal and why do you think you will be successful?
- What are the ethical considerations?
The project proposal should be a maximum of 1 page (at least 10 pt font, single spaced). LaTeX / Overleaf is encouraged, but other text editors (e.g., Microsoft Word, Google Docs) are acceptable. Citations of previous work are encouraged and don't count towards your page limit. These proposal guidelines are adapted from UCSD's CSE227 and DARPA's Heilmeier Catechism.
Replicability / reproducibility study of prior ACM Internet Measurement Conference papers.
Ideas closely related to my (instructor's) research:
- Measuring degrees of anonymity (e.g., IP, Tor, blockchain, IPFS) in different systems.
- Security economics of DNS/other domain registration.
- Testing (and identifying bugs in) TLS certificate issuance software.
- Labeling/Categorizing IP hosts via DNS + NLP.
- All password studies assume a unique email login == 1 user. Can we invalidate this assumption?
- Comparison of nameserver dependencies in particular ccTLDs (e.g., .ru)
- Network inference questions in the presence of TLS encryption. What can you infer in an HTTPS session? Can you tell if they had a cookie? Can you make guesses about what page they viewed?
- Evading DGA classifiers with semantically-meaningful encoding of subdomains that mimics real subdomains
Other ideas (mostly pruned from https://cseweb.ucsd.edu/classes/fa21/cse227-a/projects.html):
- Censorship reproducibility study
- Robo-responding to robo-callers
- Techniques for automatically identifying popular libraries and functions used in unknown binaries (i.e., for aiding in reverse engineering via Ghidra)
- Implementation of "implicit" second factor via smartwatch (i.e., witnessing you typing password using accelerometers) and/or gesture second factor
- Security analysis of robot vacuum with LiDAR (e.g., roborock)
- Retrospective identification of fraudulent DNS names (i.e., names that appear and leave zone files and there is reason to believe fraud)
- Tie FAA flight database with network log data to infer which users are travelling and where they came from
- Develop a system like X-ray to infer how various services track and trade your behavior based on the advertisements that are given to you.
- Can one recover finger typing or PIN button presses from accelerometers in a FitBit or Apple Watch?
- Build proof of concept Light-bulb based covert-channel (microphone in Zigbee lightbulb)
- Video-conferencing filter that obscures identity but looks natural (autotune for faces)
- Identifying individuals in crowd scenes via non-traditional cues (e.g. back of the head recognition)
- Meaningful visualization of security data (e.g., spam, net, etc.)
- Seeing through privacy glass or 3M privacy screens?
- Explore other variants of implicit memory passwords (i.e., where you don't know the password yourself) to see if you can improve training time or recognition time.
- Analyze whether certain authors are more likely to introduce security vulnerabilities; does overall experience matter? experience on a project?
- Is there a difference in security vulnerability density as a function of software age or programming language?
- A security analysis of any interesting device...
- Design an agent that alerts users about security issues (e.g., HTTPS problems) only when they are entering PII and evaluate if that context helps improve their security hygiene.
- Can we relate bad security programming advice online (e.g., StackExchange) to real vulnerabilities in the wild?
- Explore how/if sound can be used to enhance security awareness.
- What about other contextual cues (e.g., subtle shaking of window, color shifting, etc.)? Can people be nudged to do the right thing?
- Do a study to determine how infrequently negative security advice must occur for it to be taken more seriously. What is the tradeoff between frequency and effectiveness?
- Build system to identify the kinds of information being targeted by different kinds of malware
- Evaluate malware delivery vectors: P2P malware vs web sites vs attachments, etc ... are they all carrying the same malware or different?
- Evaluate time-to-detect for commercial malware
- Build IDA or Ghidra plug-in to locate particular “kind” of code in binary (e.g., AES code, CRC code, packing code, network code, etc.)
- Use NLP to track good/service pricing on underground forums/IRC
- Come up with a technique to infer the profitability of Ransomware
- Do a measurement study of criminal proxy networks
- Predict which code changes will produce software vulnerabilities
- Build classifier to predict machine compromise based on what sites you visit
- Build a system that learns a profile for "normal" kernel memory usage and can alert if memory contents are anomalous
- Detection of Bots in MMORPGs
- Analysis of Taser authentication
- Analysis of on-line poker (fair deal or not?)
- Hardware support for self-destructing data
- Hardware support for information flow tracking
- Build a system that, whenever you run an executable from the network, spawns two new VMs, one where you run the program and one where you don't, then compares the state changes between the two to decide if something bad has happened and, if so, can "undo" to the world where you didn't run the program.
- Security analysis of campus power grid
- Repeat Ozment/Schechter’s Milk/Wine study on vulnerability generation w/another system (great study!)
- Are there vulnerabilities in Digital FM radio?
- Attacks against smart batteries (drain beyond ability to recharge or make them explode)
- Build an interactive biometric system (e.g., proof of presence via eye-tracking) to prevent simple replay attacks
- Blockchain as a more robust alternative to bulletproof hosting, which hides malicious activity
Project report [Rubric]
The project report should resemble a 6- or 9- page workshop / short paper. Here are some examples:
- 6 or 9 pages maximum (including references), depending on project group size.
- Text should be 10pt font, 2 column format. Table/figure captions and labels can be 12pt font. Section headings can be larger as well. If in doubt, follow the format in the example papers.
- Citations should use IEEE style.
- Feel free to use LaTeX / Overleaf (ACM template style files) or Microsoft Word / Google Docs.
Research presentation [Rubric]
The project presentations should be similar in content to the research paper presentations throughout the quarter; however, they should de-emphasize background explanation and there should be no class interaction. Some additional details:
- Date + time: TBD
- Length: 8 minutes for 2-person groups; 10 minutes for 3-person groups.
- Presenters: Everyone from the group should deliver part of the presentation.