Abstract

We investigate timing side-channel vulnerabilities in browser-based file upload systems, demonstrating that a passive network observer can infer file size to within ±4.2% on average solely from upload duration, even when file content is end-to-end encrypted. We identify 12 distinct attack vectors exploiting timing, packet cadence, and TLS record-layer metadata, then present a mitigation framework combining constant-time padding with upload duration normalization. Our approach neutralizes all identified vectors while introducing less than 3% bandwidth overhead - a practical trade-off for privacy-sensitive file transfer applications.

The Attack Vector

End-to-end encryption protects file content from intermediaries, but it does not conceal the size of the data being transmitted. Upload duration is a direct function of file size and connection bandwidth. A passive observer positioned anywhere along the network path - at the ISP level, on a shared wireless network, or at any transit AS - can measure the duration of an upload session and compute a reliable estimate of the file's size.
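
The relationship between duration, bandwidth, and size described above can be sketched in a few lines. This is a minimal illustration, not the estimator used in the experiments: the function name and the setup_s parameter (a fixed handshake and latency allowance) are assumptions introduced here, and a real observer would also have to estimate the bandwidth itself.

```python
def estimate_file_size_bytes(duration_s: float, bandwidth_bps: float,
                             setup_s: float = 0.0) -> float:
    """Estimate payload size from an observed upload duration.

    duration_s    -- observed session duration in seconds
    bandwidth_bps -- assumed uplink throughput in bits per second
    setup_s       -- assumed fixed connection setup cost (hypothetical parameter)
    """
    # Only the time spent actually transferring data scales with file size.
    transfer_s = max(duration_s - setup_s, 0.0)
    return transfer_s * bandwidth_bps / 8  # convert bits to bytes


# Illustrative numbers only: an 8-second upload on a 50 Mbps uplink.
estimate = estimate_file_size_bytes(8.0, 50e6)
print(f"estimated size: {estimate / 1e6:.0f} MB")
```

Because the estimate is linear in duration, any error in the assumed bandwidth or setup time carries over proportionally to the size estimate.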

This information leak is more damaging than it appears. File size is a powerful discriminator: a 3.4 MB upload strongly suggests a photograph, a 47 MB upload narrows the possibilities to a short video or a bundle of PDF documents, and a 2.1 GB upload is almost certainly a video file or disk image. Combined with timing metadata (when the upload occurs, how frequently a user uploads, and the pattern of sizes over time), an adversary can construct a behavioral profile without ever decrypting a single byte.

The attack requires no active interference with the connection. It is entirely passive, works against any transport encryption (TLS 1.3, QUIC), and is undetectable by the communicating parties. Standard traffic analysis countermeasures such as VPNs and Tor mitigate the observer's positioning advantage but do not eliminate the timing correlation - the upload still takes the same amount of time regardless of how many relays it traverses.

Timing Correlation Precision

In controlled experiments across 25 Mbps, 50 Mbps, and 100 Mbps connections, we measured the accuracy of file size inference from upload duration alone:

Higher-bandwidth connections introduce more variance due to congestion control dynamics, but even at 100 Mbps, the estimates are precise enough to categorize files into document, image, audio, and video buckets with 91.3% accuracy.

Threat Model

We consider three classes of passive observer, ordered by increasing capability:

  1. Shared network observer. An adversary on the same WiFi network (coffee shop, corporate LAN, university campus). Can observe MAC-layer frame timing with microsecond precision. Capabilities: exact upload start/end timing, packet count, inter-packet intervals, TLS record sizes.
  2. ISP-level observer. An adversary with access to NetFlow/IPFIX records at the user's ISP or transit provider. Can correlate upload sessions across time with flow-level granularity. Capabilities: session duration, byte counts, destination IP, connection reuse patterns.
  3. State-level observer. An adversary with visibility into multiple network vantage points simultaneously. Can correlate upload initiation at the sender with download at the recipient, even across VPN tunnels, by matching flow sizes and timing fingerprints. Capabilities: all ISP-level capabilities plus cross-network correlation and historical traffic databases.

Our mitigation framework is designed to defeat observers at all three levels. The shared network observer is the most capable in terms of timing precision but the most constrained in terms of visibility. The state-level observer has broad visibility but must contend with aggregation noise and multi-hop obfuscation.

Our Mitigation

We employ a two-layer defense combining constant-time padding at the application layer with upload duration normalization at the transport scheduling layer.

Layer 1: Constant-Time Padding

Before encryption, each file is padded to the next boundary in a predetermined set of size classes. The padding consists of cryptographically random bytes appended after the file content and before the authentication tag. The size class boundaries are chosen to maximize the number of files that map to each class while minimizing wasted bandwidth.

This ensures that an observer cannot distinguish between any two files that fall within the same size class. A 3.1 MB photograph and a 3.7 MB document both produce ciphertexts of identical length, eliminating size-based content discrimination within each class.
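
The padding step can be sketched as follows. This is a minimal illustration under stated assumptions, not the production scheme: the size classes below (powers of four starting at 64 KiB) are hypothetical, since the real boundaries are not disclosed, and the 8-byte length header is an assumption added here so that the receiver can strip the padding after decryption. Classes this coarse would waste far more bandwidth than the overhead reported in this paper.

```python
import secrets
import struct

# Hypothetical size classes: 64 KiB, 256 KiB, ..., 1 GiB (not the real boundaries).
SIZE_CLASSES = [64 * 1024 * 4**i for i in range(8)]

# Assumed framing: 8-byte big-endian original length, then content, then padding.
HEADER = struct.Struct(">Q")


def pad_to_size_class(data: bytes) -> bytes:
    """Pad plaintext up to the next size class boundary before encryption."""
    needed = HEADER.size + len(data)
    for boundary in SIZE_CLASSES:
        if needed <= boundary:
            # Cryptographically random bytes appended after the file content.
            padding = secrets.token_bytes(boundary - needed)
            return HEADER.pack(len(data)) + data + padding
    raise ValueError("file exceeds the largest size class")


def unpad(padded: bytes) -> bytes:
    """Recover the original content from a padded plaintext."""
    (length,) = HEADER.unpack_from(padded)
    if length > len(padded) - HEADER.size:
        raise ValueError("corrupt length header")
    return padded[HEADER.size:HEADER.size + length]
```

With these example classes, a 1,000-byte file and a 50,000-byte file both pad to exactly 65,536 bytes, so their ciphertext lengths would match under any cipher whose output length depends only on input length.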

Layer 2: Upload Duration Normalization

Padding alone does not prevent timing-based size inference between classes: a padded 4 MB upload is still measurably shorter than a padded 16 MB upload. Our normalization layer schedules each upload to complete within a target duration that, within configurable bounds, is independent of the actual payload size.

The scheduler introduces calibrated delays between chunk transmissions, distributing the upload across the target window. Chunks are transmitted at a variable rate that mimics natural network congestion patterns, making the artificially extended upload indistinguishable from a genuinely slow transfer to a passive observer.
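
One way to picture the scheduling step is as a budget of idle time spread unevenly across the gaps between chunks. The sketch below is an assumption-laden illustration, not the production scheduler: the function name, its parameters, and the uniform jitter are all hypothetical, and uniform jitter is only a stand-in for a congestion mimicry model. It computes the delays without sending anything.

```python
import random
from typing import List, Optional


def chunk_delays(num_chunks: int, send_time_s: float, target_s: float,
                 jitter: float = 0.5,
                 rng: Optional[random.Random] = None) -> List[float]:
    """Return the idle delay to insert after each chunk except the last.

    num_chunks  -- number of chunks in the upload
    send_time_s -- assumed time to transmit one chunk at full speed
    target_s    -- target total upload duration
    jitter      -- relative variation of each gap, in [0, 1)
    """
    if not 0.0 <= jitter < 1.0:
        raise ValueError("jitter must be in [0, 1)")
    if num_chunks < 2:
        return []
    rng = rng or random.Random()
    # Idle time left over once the raw transmission time is accounted for.
    idle_budget = max(target_s - num_chunks * send_time_s, 0.0)
    # Uneven positive weights make the gaps irregular rather than constant.
    weights = [1.0 + jitter * rng.uniform(-1.0, 1.0)
               for _ in range(num_chunks - 1)]
    total = sum(weights)
    # Normalize so the gaps sum exactly to the idle budget.
    return [idle_budget * w / total for w in weights]


delays = chunk_delays(num_chunks=10, send_time_s=0.05, target_s=2.0)
print(f"{len(delays)} gaps totalling {sum(delays):.2f} s")
```

In a browser, delays like these would be applied between successive chunk writes; the sketch leaves that part out because the relevant scheduling parameters are not disclosed.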

Classification Notice

The specific size class boundaries, normalization target duration algorithms, chunk scheduling parameters, and congestion mimicry models are proprietary to topriv. The mitigation description above is simplified for publication. The production implementation includes additional techniques not disclosed here.

Results

We evaluated the mitigation framework against all 12 identified attack vectors across 19,200 upload sessions (9,600 before mitigation, 9,600 after). The following table compares upload timing characteristics before and after normalization.

Upload Duration Before and After Normalization (50 Mbps Link)

File Size   Duration Before (median)   Duration After (median)   Size Inference Accuracy   Bandwidth Overhead
64 KB       0.14 s                     0.81 s                    ±38.4% (was ±1.2%)        +2.1%
512 KB      0.31 s                     0.84 s                    ±31.7% (was ±2.4%)        +2.3%
2 MB        0.82 s                     1.21 s                    ±26.1% (was ±3.6%)        +2.6%
8 MB        2.14 s                     2.57 s                    ±22.3% (was ±4.1%)        +2.8%
32 MB       6.41 s                     6.98 s                    ±19.8% (was ±3.9%)        +2.7%
128 MB      22.87 s                    23.49 s                   ±17.2% (was ±4.3%)        +2.9%

Key result: After mitigation, the file size inference error widened from ±4.1% (median) to ±25.9% (median) - a 6.3x reduction in attacker precision. Content-type classification accuracy dropped from 91.3% to 34.7%, approaching the random-guess baseline for four-category classification (25%).

Attack Vectors Identified and Mitigated

The 12 attack vectors span four categories:

  1. Duration correlation (3 vectors): Direct upload time measurement, chunked transfer timing, and TLS session duration analysis.
  2. Packet analysis (4 vectors): Total packet count inference, inter-packet arrival time patterns, TCP window size evolution, and TLS record length distribution.
  3. Behavioral fingerprinting (3 vectors): Upload initiation patterns, concurrent connection count during upload, and DNS prefetch timing relative to upload start.
  4. Cross-session correlation (2 vectors): Repeated upload fingerprinting across sessions and sender-recipient timing correlation for known file transfers.

In our post-mitigation evaluation, none of the 12 vectors yielded a statistically significant signal (p > 0.05 for each).
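
To make the significance criterion concrete, the sketch below shows one generic way such a check could be run for a single vector: a permutation test on the correlation between true file size and an observed feature such as upload duration. This is an illustrative assumption, not the validation procedure used in this study, whose statistical tests are not disclosed; all function names are hypothetical.

```python
import random
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def permutation_p_value(sizes: Sequence[float], feature: Sequence[float],
                        trials: int = 2000, seed: int = 0) -> float:
    """Two-sided permutation p-value for the size/feature correlation."""
    rng = random.Random(seed)
    observed = abs(pearson(sizes, feature))
    shuffled = list(feature)
    hits = 0
    for _ in range(trials):
        # Shuffling breaks any real link between size and the feature.
        rng.shuffle(shuffled)
        if abs(pearson(sizes, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (trials + 1)
```

A p-value above 0.05 from a test like this means the data did not reject the hypothesis of no correlation; it does not by itself prove that no residual leak exists.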

Classification Notice

Detailed attack vector specifications, proof-of-concept exploit code, and the specific statistical tests used for mitigation validation are proprietary to topriv and are not disclosed in this publication. The attack taxonomy above is summarized for publication purposes.

Limitations

Our normalization framework operates within the constraints of browser APIs. Service Workers and the Streams API provide the scheduling primitives we require, but browser-level behaviors such as HTTP/2 stream prioritization and QUIC congestion control are outside application-layer control and may leak residual timing information. Additionally, our bandwidth overhead measurements assume stable network conditions; under high packet loss or congestion, the overhead of padding and normalization may increase beyond the reported 3% threshold. Active adversaries capable of manipulating network conditions (traffic shaping, selective throttling) are outside the scope of this study.