Statistical Methods in Cyber Security

Chair: Xiaoming Huo, Georgia Institute of Technology, School of Industrial & Systems Engineering

Organizer: Xiaoming Huo, Georgia Institute of Technology, School of Industrial & Systems Engineering

Organizer: Bin Yu, University of California at Berkeley
 
Tuesday, Aug 6: 8:30 AM - 10:20 AM
Invited Paper Session 1123
Oregon Convention Center, Room CC-258

Applied: Yes

Main Sponsor

Section on Statistical Learning and Data Science

Co-Sponsors

IMS
Section on Statistical Computing

Presentations

Enabling Asymptotic Truth Learning in a Network

Consider a network of distributed agents that all want to guess the correct value of some ground truth state. In a sequential order, each agent makes its decision using a single private signal, which has a constant probability of error, as well as observations of the actions of its network neighbors earlier in the order. We are interested in enabling network-wide asymptotic truth learning, meaning that in a network of n agents, almost all agents make a correct prediction with probability approaching one as n goes to infinity. In this paper we study carefully crafted decision orders with respect to the graph topology, as well as sufficient or necessary conditions for a graph to support such a good ordering.

We first show that on a sparse graph with a random ordering, asymptotic truth learning does not happen. We then show a rather modest sufficient condition for enabling asymptotic truth learning. With the help of this condition, we characterize graphs generated from the Erdős-Rényi model and the preferential attachment model. In an Erdős-Rényi graph, unless the graph is super sparse (with O(n) edges) or super dense (with Ω(n^2) edges), there exists a decision ordering that supports asymptotic truth learning. Similarly, any preferential attachment network with a constant number of edges per node can achieve asymptotic truth learning under a carefully designed ordering. We also evaluate a variant of the decision ordering on different network topologies and demonstrate that it clearly improves truth learning over random orderings.
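
A minimal simulation sketch of the setting described above (not the authors' algorithm): each agent draws a private signal that is correct with probability p > 1/2 and, in a given decision order, guesses by majority vote over its own signal and the earlier guesses of its graph neighbors. The graph model, the majority rule, and the degree-based ordering below are illustrative assumptions.

```python
import random
import networkx as nx

def simulate(G, order, p=0.7, truth=1):
    """Sequentially let agents guess the ground truth.

    Each agent sees a private signal (correct with probability p) plus the
    earlier guesses of its graph neighbors, and votes by simple majority.
    Returns the fraction of agents that guess correctly.
    """
    guess = {}
    for v in order:
        signal = truth if random.random() < p else 1 - truth
        votes = [signal] + [guess[u] for u in G.neighbors(v) if u in guess]
        ones = sum(votes)
        if ones * 2 > len(votes):
            guess[v] = 1
        elif ones * 2 < len(votes):
            guess[v] = 0
        else:
            guess[v] = signal  # break ties with the private signal
    return sum(g == truth for g in guess.values()) / G.number_of_nodes()

if __name__ == "__main__":
    n = 2000
    G = nx.erdos_renyi_graph(n, 0.01, seed=0)
    random_order = list(G.nodes())
    random.shuffle(random_order)
    # Illustrative "crafted" order: process high-degree nodes last so they
    # can aggregate many earlier guesses from their neighbors.
    crafted_order = sorted(G.nodes(), key=G.degree)
    print("random order :", simulate(G, random_order))
    print("crafted order:", simulate(G, crafted_order))
```

Running this on an Erdős-Rényi graph lets one compare the fraction of correct guesses under a random ordering with a crafted ordering in which high-degree nodes decide after many of their neighbors.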

Speaker

Jie Gao, Rutgers University

Presentation

Speaker

Bo Li, University of Chicago

Adaptive learning in two-player Stackelberg games with application to network security

This paper proposes an adaptive learning approach to solve two-player Stackelberg games with incomplete information. Specifically, the leader lacks knowledge of the follower's cost function, but knows that the follower's response function to the leader's action belongs to a known parametric family with unknown parameters. Our algorithm simultaneously estimates these parameters and optimizes the leader's action. It guarantees that the estimates of the follower's action and the leader's cost converge to their true values within finite time, with a preselected error bound that can be made arbitrarily small. Additionally, the first-order necessary condition for optimality is asymptotically satisfied for the leader's estimated cost. Under persistent excitation conditions, the parameter estimation error also remains within a preselected, arbitrarily small bound. Even with mismatches between the known parametric family and the follower's actual response function, our algorithm converges robustly, with error bounds proportional to the size of the mismatch. Simulation examples in the domain of network security illustrate the algorithm's effectiveness and convergence.
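
A hedged sketch of the general idea in a toy instance (not the paper's algorithm): the follower's response is assumed linear in the leader's action with unknown parameters, and the leader alternates between a gradient update of its parameter estimate from the observed response and a gradient step on its estimated cost. The model, cost function, and step sizes below are illustrative assumptions.

```python
import numpy as np

# Toy two-player Stackelberg game (illustrative only): the follower's true
# response to leader action a is r(a) = theta1 * a + theta0, where
# theta = (theta1, theta0) is unknown to the leader. The leader wants to
# minimize J(a) = a^2 + (r(a) - 1)^2.

rng = np.random.default_rng(0)
theta_true = np.array([0.8, -0.5])   # unknown follower parameters
theta_hat = np.zeros(2)              # leader's running estimate
a = 1.0                              # leader's current action
eta_theta, eta_a = 0.1, 0.05         # step sizes (tuning choices)

def follower_response(a):
    # Follower plays its true response, observed with small noise.
    return theta_true @ np.array([a, 1.0]) + 0.01 * rng.standard_normal()

for t in range(2000):
    r = follower_response(a)         # observe the follower's action
    phi = np.array([a, 1.0])         # regressor for the linear model
    # Gradient step on the squared prediction error updates theta_hat.
    theta_hat += eta_theta * (r - theta_hat @ phi) * phi
    # Gradient step on the leader's estimated cost
    # J_hat(a) = a^2 + (theta_hat1 * a + theta_hat0 - 1)^2.
    r_hat = theta_hat @ np.array([a, 1.0])
    grad = 2 * a + 2 * (r_hat - 1.0) * theta_hat[0]
    a -= eta_a * grad

print("estimated follower parameters:", theta_hat)
print("leader action after adaptation:", a)
```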

Speaker

Guosong Yang, Rutgers University

A Statistical Method for Safety Alignment of LLMs

As large language models (LLMs) become increasingly integrated into real-world applications such as code generation and chatbot assistance, extensive efforts have been made to align LLM behavior with human values, including safety. Jailbreak attacks, aiming to provoke unintended and unsafe behaviors from LLMs, remain a significant security threat to LLM deployment.

This talk introduces a statistical method to ensure the safety alignment of LLMs. We observe that safe and unsafe behaviors exhibited by LLMs differ in the probability distributions of tokens. An unsafe response generated by an LLM corresponds to a distribution in which the probabilities of tokens representing harmful content outweigh those representing harmless responses. We leverage this observation to develop a lightweight safety-aware decoding strategy, SafeDecoding, for safety alignment. SafeDecoding mitigates jailbreak attacks by identifying safety disclaimers and amplifying their token probabilities, while attenuating the probabilities of token sequences aligned with the objectives of jailbreak attacks, guided by the observed token distribution shifts. We perform extensive experiments on five LLMs using six state-of-the-art jailbreak attacks and four benchmark datasets. Our results show that SafeDecoding significantly reduces the attack success rate and harmfulness of jailbreak attacks without compromising the helpfulness of responses to benign user queries. This work is supported by the NSF AI Institute for Agent-based Cyber Threat Intelligence and Operation (ACTION).
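
A hedged sketch of the kind of token-distribution reweighting described above, using synthetic distributions rather than an actual LLM: a base model's next-token probabilities are shifted toward those of a safety-tuned expert model over a jointly high-probability support, so tokens the expert favors (for example, safety disclaimers) are amplified and tokens it suppresses are attenuated. The function name, the top-k intersection, and the mixing parameter alpha are assumptions for illustration, not the exact SafeDecoding construction.

```python
import numpy as np

def safety_aware_next_token_probs(p_base, p_expert, alpha=1.0, top_k=10):
    """Combine a base model's next-token distribution with that of a
    safety-tuned expert model, amplifying tokens the expert prefers.

    Illustrative sketch only. p_base and p_expert are probability vectors
    over the same vocabulary.
    """
    # Restrict to tokens that both models rank highly (sample space).
    top_base = set(np.argsort(p_base)[-top_k:])
    top_expert = set(np.argsort(p_expert)[-top_k:])
    support = np.array(sorted(top_base & top_expert) or sorted(top_base))

    # Shift the base probabilities toward the expert: tokens the expert
    # favors are amplified, tokens it suppresses are attenuated.
    shifted = p_base[support] + alpha * (p_expert[support] - p_base[support])
    shifted = np.clip(shifted, 0.0, None)
    return support, shifted / shifted.sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vocab = 50
    p_base = rng.dirichlet(np.ones(vocab))     # stand-in for the base model
    p_expert = rng.dirichlet(np.ones(vocab))   # stand-in for the safety expert
    tokens, probs = safety_aware_next_token_probs(p_base, p_expert, alpha=2.0)
    print("sampled token:", rng.choice(tokens, p=probs))
```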
 

Co-Author

Radha Poovendran, University of Washington

Speaker

Zhangchen Xu, University of Washington