-
Notifications
You must be signed in to change notification settings - Fork 723
Add support for WinDivert packet capture engine #2019
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## dev #2019 +/- ##
==========================================
- Coverage 83.46% 83.46% -0.01%
==========================================
Files 311 312 +1
Lines 54556 54580 +24
Branches 11491 11472 -19
==========================================
+ Hits 45534 45553 +19
- Misses 7795 7817 +22
+ Partials 1227 1210 -17
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
- Add missing doxygen documentation
| PTF_ASSERT_TRUE(sendURLRequest("www.google.com")); | ||
| // let the capture work for couple of seconds | ||
| totalSleepTime = incSleep(capturedPackets, 2, 7); | ||
| totalSleepTime = incSleep(capturedPackets, 2, 20); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test also failed in CI so I made it more robust
| parser.add_argument( | ||
| "--include-tests", | ||
| "-t", | ||
| type=str, | ||
| nargs="+", | ||
| default=[], | ||
| help="Pcap++ tests to include", | ||
| ) | ||
| parser.add_argument( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change was needed to support running only WinDivert tests in the windivert job in CI
…and IWinDivertImplementation creates it
| if (result.status != ReceiveResult::Status::Completed) | ||
| { | ||
| m_IsReceiving = false; | ||
| return { result.status, result.error, result.errorCode }; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This branch can get executed if the IO operation is still pending (ERROR_IO_PENDING + event wait timeout expired).
Unless the IO operation is somehow cancelled, which I don't believe we do currently, it is unsafe to return.
Returning would invalidate the overlapped structure held in overlapped as the unique_ptr would go out of scope and delete the structure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to make sure I understand: in the case where the event timeout expires, we should call ResetEvent()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. That won't help.
The issue is that Win32 overlapped ops are enqueued and stay pending indefinitely, unless manually cancelled. They don't have a built-in timeout. The event wait timing out does not cancel the Async IO. It just terminates the waiting operation (WaitForSingleObject).
This is the current flow that we have in case there are no incoming packets:
sequenceDiagram
actor usr as User
participant device as WinDivert Device
participant handle as WinDivert Handle
participant windivert as WinDivert Library
participant win32 as Windows Runtime
participant win32io as Windows Overlapped IO subsystem
usr->>device: receivePackets
activate device
device->>handle: Make overlapped
activate handle
handle-->>device: Overlapped
deactivate handle
device->>device: receivePacketsInternal
activate device
device->>handle: recvEx
activate handle
handle->>windivert: WinDivertRecvEx
activate windivert
windivert->>win32: DeviceIOControl
activate win32
win32-)win32io: Begin Async operation
activate win32io
note over win32io: The async operation is started.
note over win32io: With no new packet it will stay pending indefinitely.
note over win32: Async IO was started:<br/> Return pending IO result code immediately.
win32-->>windivert: Result Code ERROR_IO_PENDING
deactivate win32
windivert-->>handle: Result code ERROR_IO_PENDING
deactivate windivert
handle-->>device: Result code ERROR_IO_PENDING
deactivate handle
device->>device: Wait on the event.
activate device
note right of device: No packets, timeout will expire.
note right of device: Notice how the Async IO is not cancelled.
device-->>device: Error code: TIMEOUT
deactivate device
device-->>device: ReceiveResult::Status::Timeout
deactivate device
device->>device: Check receive result
device-->>usr: Return result Timeout.
note over device: Overlapped goes out of scope and gets deleted.
deactivate device
win32io->>win32io: Async IO is still running and may attempt to use Overlapped
note left of win32io: Heap-use-after-free: Overlapped
deactivate win32io
The issue is at the final operations. As we exit receivePackets the overlapped structure goes out of scope, but the async IO is still running. We either need to call CancelIOEx to actually cancel the operation or keep the overlapped structure alive until it finishes.
NB: CancelIOEx is a request for cancellation, we still have to fetch the result of Overlapped. It might have succeeded in the timeframe the cancellation request took to process.
PS: Isn't async fun? 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I asked Claude and it suggested adding this to the destructor. Would that be sufficient?
~WinDivertOverlappedWrapper() override
{
// Cancel any pending I/O
CancelIoEx(m_Handle, &m_Overlapped);
// Wait for cancellation to complete
WaitForSingleObject(m_Event, INFINITE);
// Now safe to cleanup
CloseHandle(m_Event);
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you don't care about the result after you call the cancel, yes. That should be fine.
If you do care about the result. The operation can technically still succeed even after you called CancelIoEx depending on scheduling. You don't know until you call GetOverlappedResult on the overlapped operation after the event gets signaled.
| if (result.status != ReceiveResult::Status::Completed) | ||
| { | ||
| return { result.status, result.error, result.errorCode }; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto: Overlapped, heap use after free.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See this response/question: #2019 (comment)
| auto result = m_Handle->recvEx(buffer.data(), buffer.size(), batchSize, overlapped); | ||
| if (result != internal::IWinDivertHandle::ErrorIoPending) | ||
| { | ||
| return { ReceiveResult::Status::Failed, "Error receiving packets: " + getErrorString(result), result }; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think recvEx can technically succeed synchronously, thus the success state should also be handled as a separate case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I've explored and seen, since we're using OVERLAPPED we should always expect the callback from the OVERLAPPED object.
WinDivertRecvEx can indeed work synchronously if we don't pass the OVERLAPPED object, but then it will never timeout, meaning if no packets are received, it will stay stuck indefinitely
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Underneeth all the code, WinDivert uses Win32's DeviceIOControl API call.
This is an extract from MS's Overlapped IO reference:
"When a function is called to perform an overlapped operation, the operation might be completed before the function returns. When this happens, the results are handled as if the operation had been performed synchronously. If the operation was not completed, however, the function's return value is FALSE, and the GetLastError function returns ERROR_IO_PENDING."
https://learn.microsoft.com/en-us/windows/win32/sync/synchronization-and-overlapped-input-and-output#:~:text=When%20a%20function%20is%20called,GetLastError%20function%20returns%20ERROR_IO_PENDING.
We can not assume that the overlapped operation will always return IO pending. The overlapped API is more of a "you can do it async" than a "you must do it async."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The provided overlapped object is always correctly populated, tho. Even if the operation is completed synchronously.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand what the fix would be - if the result isn't ERROR_IO_PENDING we should treat it as a sync call?
That would complicate the implementation by quite a lot... how common it is to not get ERROR_IO_PENDING as a result? I never saw it in my tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the result isn't ERROR_IO_PENDING we should treat it as a sync call?
As far as I understand it, yes, but with caveats.
- All the "out" parameters on the call are still populated as if a sync call was made.
- The overlapped structure is still correctly populated with the state of the task.
You can still useGetOverlappedResultto fetch the result if you want. - I am unsure if the event gets signaled or not. The docs don't say anything specific about it and GPT said yes. So I would assume it gets signalled.
- There are some other caveats for IOCP usage and completion routines but those aren't relevant here.
How common it is to not get ERROR_IO_PENDING as a result?
Not sure how common it can happen in our case. Probably not very common, but it can still happen by the Win32 Overlapped IO Spec. Generally it relies on the driver having the data ready at the time of call or available with minimal delay or something. The only way to be sure it can't happen is if we go digging in the driver code, which would probably be more tiresome.
| if (result.status != ReceiveResult::Status::Completed) | ||
| { | ||
| m_IsReceiving = false; | ||
| return { result.status, result.error, result.errorCode }; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. That won't help.
The issue is that Win32 overlapped ops are enqueued and stay pending indefinitely, unless manually cancelled. They don't have a built-in timeout. The event wait timing out does not cancel the Async IO. It just terminates the waiting operation (WaitForSingleObject).
This is the current flow that we have in case there are no incoming packets:
sequenceDiagram
actor usr as User
participant device as WinDivert Device
participant handle as WinDivert Handle
participant windivert as WinDivert Library
participant win32 as Windows Runtime
participant win32io as Windows Overlapped IO subsystem
usr->>device: receivePackets
activate device
device->>handle: Make overlapped
activate handle
handle-->>device: Overlapped
deactivate handle
device->>device: receivePacketsInternal
activate device
device->>handle: recvEx
activate handle
handle->>windivert: WinDivertRecvEx
activate windivert
windivert->>win32: DeviceIOControl
activate win32
win32-)win32io: Begin Async operation
activate win32io
note over win32io: The async operation is started.
note over win32io: With no new packet it will stay pending indefinitely.
note over win32: Async IO was started:<br/> Return pending IO result code immediately.
win32-->>windivert: Result Code ERROR_IO_PENDING
deactivate win32
windivert-->>handle: Result code ERROR_IO_PENDING
deactivate windivert
handle-->>device: Result code ERROR_IO_PENDING
deactivate handle
device->>device: Wait on the event.
activate device
note right of device: No packets, timeout will expire.
note right of device: Notice how the Async IO is not cancelled.
device-->>device: Error code: TIMEOUT
deactivate device
device-->>device: ReceiveResult::Status::Timeout
deactivate device
device->>device: Check receive result
device-->>usr: Return result Timeout.
note over device: Overlapped goes out of scope and gets deleted.
deactivate device
win32io->>win32io: Async IO is still running and may attempt to use Overlapped
note left of win32io: Heap-use-after-free: Overlapped
deactivate win32io
The issue is at the final operations. As we exit receivePackets the overlapped structure goes out of scope, but the async IO is still running. We either need to call CancelIOEx to actually cancel the operation or keep the overlapped structure alive until it finishes.
NB: CancelIOEx is a request for cancellation, we still have to fetch the result of Overlapped. It might have succeeded in the timeframe the cancellation request took to process.
PS: Isn't async fun? 😄
This PR integrates WinDivert as another packet capture engine in PcapPlusPlus, enabling packet capture/injection on Windows via the WinDivert driver.
What is WinDivert
WinDivert is an open-source Windows library (kernel + user-mode) that allows applications to intercept, modify, drop or inject network packets traversing the Windows network stack. It is designed for use cases such as packet sniffing, firewalling, NAT-/VPN-tunneling, loopback traffic inspection, etc.
Key features include:
Project Links
Testing
This PR includes basic tests for the
WinDivertDevice. However, it also adds a lightweight abstraction over the WinDivert API using internal interfaces. It enables testingWinDivertDevicelogic without the real driver by providing mock implementations. These mock tests aren't implemented in this PR, but can be added later.