|
|
|
`open` is not as side-effect free as I had imagined: if the `O_TRUNC` flag is passed, it truncates the file contents in addition to opening the file descriptor. In practice _emacs_ does this prior to writing the new file content, so the call needs to be intercepted in order to start tracking the file before it is changed.
Interposing `open` required some changes to make the library work without including `fcntl.h`. This header not only defines some of the flags we require to check whether a library call is able to change files but also declares the `open` library function.
While implementing this change I noticed that the function interpositions implemented in C++ actually need to be declared as `extern "C"` so their names do not get mangled during compilation. I suspect that this was previously done implicitly for e.g. `mmap` and `write` by the included C standard library headers. However, this did not work for `open`, which is why all function interpositions are now explicitly declared `extern "C"`.
End result: _emacs_ file changes are now tracked correctly.
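The `O_TRUNC` case described above boils down to inspecting the flag argument before the call is forwarded. A minimal sketch, assuming `fcntl.h` may be included for the flag constant (the library itself avoids that header) and using the hypothetical helper name `truncates_on_open`:

```cpp
#include <fcntl.h>

// Hypothetical helper, not part of libChangeLog: reports whether an
// `open` call with the given flags discards existing file content
// before any `write` call could be observed.
bool truncates_on_open(int flags) {
    return (flags & O_TRUNC) != 0;
}
```

An interposed `open` could consult such a check to decide whether the file has to be tracked before the call is forwarded to the actual implementation.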
|
|
|
|
|
|
|
|
i.e. `change` now tries to read a filter definition file matching the current process' name from `/usr/local/share/libChangeLog/filter`.
|
|
|
|
|
|
|
|
The previous interposition logic, based on plain usage of `dlsym` analogously to various online examples, led to a deadlock during _neovim_ startup. This deadlock was caused by _neovim_'s custom memory allocation library _jemalloc_ because it calls `mmap` during its initialization phase. The problem with calling `mmap` during initialization is that this already leads to executing `libChangeLog`'s `mmap` version, whose static `actual_mmap` function pointer is not initialized at this point in time. This is detected and leads to a call to `dlsym` to remedy the situation. Sadly `dlsym` in turn requires memory allocation using `calloc`, which leads us back to initializing _jemalloc_ and as such to a deadlock.
I first saw this as a bug in _jemalloc_ which seemed to be confirmed by a short search in my search engine of choice. This prompted me to create an appropriate [bug report](https://github.com/jemalloc/jemalloc/issues/329) which was dismissed as a problem in the way `mmap` was interposed and not as a bug in the library. Thus it seems to be accepted practice that it is not the responsibility of a custom memory allocator to cater to the initialization needs of other libraries relying on function interposition. This is of course a valid position as the whole issue is a kind of _chicken and egg_ problem where both sides can be argued.
To cut to the chase, I was left with the only option of working around this deadlock by adapting `libChangeLog` to call `dlsym` without relying on the wrapped application's memory allocator of choice. The most straightforward way to do this is to provide another custom memory allocator alongside the _payload_ function interpositions of `mmap` and friends.
`init/alloc.cc` implements such a selectively transparent memory allocator that offers a small static buffer for usage in the context of executing `dlsym`. The choice between forwarding memory allocation requests to the wrapped application's allocator and using the static buffer is governed by `init::dlsymContext`. This tiny helper class maintains a `dlsym_level` counter by posing as a scope guard.
The end result of this extension to `libChangeLog` is that it now also works with applications using _jemalloc_ such as _neovim_ and should overall be much more robust during its initialization phase.
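The interplay between the scope guard and the static buffer can be sketched as follows. This is a simplified illustration rather than the content of `init/alloc.cc`: the names `sketch::DlsymContext`, `zeroed_alloc` and `release`, the buffer size and the bump allocation strategy are all assumptions, and the sketch is not thread-safe.

```cpp
#include <cstddef>
#include <cstdlib>
#include <functional>

namespace sketch {

// Number of `dlsym` calls currently in flight (assumed to be a plain counter).
int dlsym_level = 0;

// Scope guard: while an instance is alive, allocations use the static buffer.
class DlsymContext {
public:
    DlsymContext()  { ++dlsym_level; }
    ~DlsymContext() { --dlsym_level; }
    DlsymContext(const DlsymContext&) = delete;
    DlsymContext& operator=(const DlsymContext&) = delete;

    static bool active() { return dlsym_level > 0; }
};

// Static storage is zero-initialized and never handed out twice,
// so memory taken from it already satisfies `calloc` semantics.
alignas(std::max_align_t) unsigned char buffer[4096];
std::size_t used = 0;

bool from_buffer(const void* ptr) {
    const auto* p = static_cast<const unsigned char*>(ptr);
    std::less<const unsigned char*> before{};
    return !before(p, buffer) && before(p, buffer + sizeof(buffer));
}

void* zeroed_alloc(std::size_t count, std::size_t size) {
    if (!DlsymContext::active()) {
        return std::calloc(count, size);  // forward to the regular allocator
    }
    if (size != 0 && count > sizeof(buffer) / size) {
        return nullptr;  // request can never fit (also guards overflow)
    }
    const std::size_t align     = alignof(std::max_align_t);
    const std::size_t requested = count * size;
    const std::size_t bytes =
        requested == 0 ? align : (requested + align - 1) / align * align;
    if (bytes > sizeof(buffer) - used) {
        return nullptr;  // static buffer exhausted
    }
    void* const result = buffer + used;
    used += bytes;
    return result;
}

void release(void* ptr) {
    // Memory from the static buffer is never returned to the real allocator.
    if (!from_buffer(ptr)) {
        std::free(ptr);
    }
}

}  // namespace sketch
```

In this sketch a `DlsymContext` instance would be created right before calling `dlsym`, so that any `calloc` request issued during symbol lookup is served from the static buffer instead of reaching the wrapped application's allocator.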
|
|
The previous approach of storing them in static variables of the `actual` namespace and initializing them statically did not work out as it is not guaranteed that they are initialized before any interposed function is called.
|
|
|
|
|
|
|
|
All lines starting with `#` are interpreted as comments.
|
|
The library may be provided with a new-line separated list of regular expressions via the newly introduced `CHANGE_LOG_IGNORE_PATTERN_PATH`.
Any proposed tracking path that is matched by any of the provided patterns is excluded from change reporting. This functionality uses the C++ standard library's regular expression support and as such doesn't introduce any new dependencies. If no file path is provided or the provided file path is unreadable, all paths will be tracked.
`change` was adapted to set `CHANGE_LOG_IGNORE_PATTERN_PATH` to `.change_log_ignore`, which means that it will by default exclude any paths matching the patterns provided via this file in the current working directory.
An example for such a file customized for hiding _vim_'s internal write logic may look as follows:

    [0-9]+
    [^~]*~
    .*\.viminfo
    .*\.swp
Note that this is implemented in a fashion where it is not guaranteed that the full _canonical_ path is checked against the patterns. It remains to be decided if this is enough for all common use cases of this new functionality.
`tracking::PathMatcher` lacks any explicit thread synchronization - according to my current knowledge this should not be necessary as we are only ever reading the private `std::vector<std::regex>` instance.
If invalid regular expressions are provided they are silently ignored.
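The pattern handling described above could look roughly like the following sketch. `PathMatcherSketch` is a hypothetical stand-in for `tracking::PathMatcher`, not its actual code. It assumes that patterns must match the whole path (`std::regex_match`) and that lines starting with `#` are skipped as comments; neither detail is confirmed by this commit message.

```cpp
#include <istream>
#include <regex>
#include <string>
#include <vector>

// Hypothetical stand-in for tracking::PathMatcher.
class PathMatcherSketch {
public:
    // Reads one regular expression per line from the given stream.
    explicit PathMatcherSketch(std::istream& input) {
        std::string line;
        while (std::getline(input, line)) {
            if (line.empty() || line.front() == '#') {
                continue;  // assumed comment convention
            }
            try {
                patterns_.emplace_back(line);
            } catch (const std::regex_error&) {
                // Invalid expressions are silently ignored.
            }
        }
    }

    // True if any pattern matches the complete path string.
    bool is_ignored(const std::string& path) const {
        for (const std::regex& pattern : patterns_) {
            if (std::regex_match(path, pattern)) {
                return true;
            }
        }
        return false;
    }

private:
    std::vector<std::regex> patterns_;
};
```

Because `is_ignored` only reads the pattern vector after construction, concurrent calls need no further locking in this sketch.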
|
|
|
|
|
|
|
|
|
|
Introduce global static `enabled` variable used to signal the interposed functions to either start tracking or perform plain forwarding without any additional logic.
This is required as e.g. `nvim` crashed when wrapped in `libChangeLog` because it called interposed functions during library initialization.
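A minimal sketch of such a switch is shown below. All names apart from `enabled` are hypothetical, the "actual" function is a stub rather than a real library call, and using `std::atomic<bool>` is an assumption made here so that the flag is constant-initialized and safe to read from several threads.

```cpp
#include <atomic>
#include <cstddef>

namespace sketch {

// Constant-initialized to false, so it is readable before any
// dynamic initialization of the library has taken place.
std::atomic<bool> enabled{false};

std::size_t tracked_calls = 0;

// Stand-in for the tracking logic.
void track(int /*fd*/) { ++tracked_calls; }

// Stand-in for the actual function the call is forwarded to.
std::size_t forward_write(int /*fd*/, std::size_t count) { return count; }

// While `enabled` is false this is plain forwarding without extra logic.
std::size_t interposed_write(int fd, std::size_t count) {
    if (enabled.load()) {
        track(fd);
    }
    return forward_write(fd, count);
}

}  // namespace sketch
```

The library would flip `enabled` to true once its own initialization has finished, so that early calls made by the wrapped application are forwarded untouched.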
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Relying on `diff` for storing the pre-change content did not work out as it opens the file descriptors too soon and only reads the first block of pre-change file content until the remaining content is actually required. This obviously led to problems when tracking actually changing files.
|
|
This obviously leads to synchronizing syscalls that would otherwise happen in parallel. Luckily the goal of this library is to monitor file changes performed by single, user facing applications and as such it doesn't matter if some operations are slightly slower.
|
|
This ensures that the log file is actually accessible. There was a problem where e.g. `change rm test` did not create the log file correctly.
|
|
i.e. enable writing `change vim test` instead of `change "vim test"`.
|
|
|
|
`change` calls the given command wrapped in `libChangeLog` and prints the recorded change log afterwards.
|
|
|
|
|
|
Prevents the same file being tracked multiple times due to relative input paths.
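The idea can be sketched by resolving every input path before it is stored. The class below is hypothetical and uses C++17 `std::filesystem`, which is an assumption; the library may rely on a different facility to resolve paths.

```cpp
#include <cstddef>
#include <filesystem>
#include <set>
#include <string>

// Hypothetical container that stores each file once, no matter how
// the path to it was spelled by the caller.
class TrackedPathsSketch {
public:
    // Returns true if the path was not tracked before.
    bool track(const std::filesystem::path& input) {
        namespace fs = std::filesystem;
        // Make the path absolute first so that files which do not exist
        // yet are still resolved relative to the working directory.
        const fs::path resolved = fs::weakly_canonical(fs::absolute(input));
        return paths_.insert(resolved.string()).second;
    }

    std::size_t size() const { return paths_.size(); }

private:
    std::set<std::string> paths_;
};
```

With this, `test`, `./test` and `dir/../test` all end up as a single entry.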
|
|
While the file arguments remain fixed the actual `diff` application and its output style can be changed using the `CHANGE_LOG_DIFF_CMD` environment variable.
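Reading such an override is a plain environment lookup, sketched below. The function name and the `diff -u` fallback are assumptions for illustration; the default actually used by the library is not stated here.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the diff command prefix to which the
// fixed file arguments would be appended.
std::string diff_command() {
    const char* const custom = std::getenv("CHANGE_LOG_DIFF_CMD");
    if (custom != nullptr && *custom != '\0') {
        return custom;
    }
    return "diff -u";  // assumed default
}
```

Setting `CHANGE_LOG_DIFF_CMD` before starting the wrapped application would then swap the command without touching the file arguments.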
|
|
|
|
The library is designed to track the file changes performed by a single process, i.e. there is no need for explicitly stating when the process has exited. Furthermore this reduces the set of function interpositions to the ones handled by the `get_actual_function` method template.
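For illustration, a lookup template in the spirit of `get_actual_function` might look like the sketch below. The signature, the name `get_actual_function_sketch` and the error handling are assumptions and not taken from the library; only `dlsym` and `RTLD_NEXT` are the real dynamic linker interface.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // RTLD_NEXT is an extension on glibc
#endif
#include <dlfcn.h>

#include <stdexcept>
#include <string>

// Looks up the next definition of `symbol_name` in library load order,
// i.e. the implementation an interposed function would forward to.
template <typename FunctionPtr>
FunctionPtr get_actual_function_sketch(const std::string& symbol_name) {
    void* const symbol = dlsym(RTLD_NEXT, symbol_name.c_str());
    if (symbol == nullptr) {
        throw std::runtime_error("symbol not found: " + symbol_name);
    }
    return reinterpret_cast<FunctionPtr>(symbol);
}
```

A call such as `get_actual_function_sketch<int (*)(int)>("abs")` yields a pointer to the C library's `abs`. On older glibc versions the program may need to be linked with `-ldl`.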
|
|
|
|
The newly introduced `ChangeTracker` class now keeps track of all tracked files in addition to spawning and managing a corresponding `diff` instance that enables printing pretty _patch-style_ change summaries to the logging target.
This commit introduces `boost-process` and `diff` as dependencies of this library.
|
|
|
|
The pointers to the actual function implementations are now fetched inside the `actual` namespace declared in the `actual_function.h` header.
Fixed the source of a _noreturn_ related warning during compilation by adding the appropriate flag. Sadly this means that we cannot use `std::function` in this context as it doesn't seem to carry these _c-like_ flags.
|
|
|
|
|
|
|
|
|
|
|
|
The `CHANGE_LOG_TARGET` environment variable enables passing the path of an arbitrary target file to the preloaded library.
This may be used to e.g. print the log to a separate `cat` instance and is necessary for logging change events without altering the output of the wrapped process.
|
|
|