Finding files not managed by Puppet

Published by Taavi Väänänen on July 1, 2023.

With Puppet, it's fairly typical to have a directory where all files not managed by Puppet will be automatically purged, as that helps to ensure consistency between servers. For example, you could do something like this:

# @summary manages the ferm firewall frontend
class ferm () {
    # install package

    # manage main ferm.conf file

    file { '/etc/ferm/ferm.d':
        ensure  => directory,
        owner   => 'root',
        group   => 'adm',
        mode    => '0755',
        recurse => true,
        force   => true,
        purge   => true,
        notify  => Service['ferm'],
    }

    # manage service, etc
}

However, sometimes you end up with a directory that's not managed this way and you want to convert it to a fully managed directory. This is a bit risky: some hosts might have stale files in that directory that something still depends on, and purging those could break things. You can't use PuppetDB to search for unmanaged files either, as unmanaged files are by definition not in the Puppet state stored there.

To solve this, I wrote a tiny Python script that compares a directory on disk with the File resources listed in the report from the last Puppet run. On its own it's not very useful, as it only works with the local system, but it becomes really useful when paired with a tool like Cumin to run it on many servers. Cumin can be integrated with PuppetDB too, so you can run the command on the exact set of servers a potential change would affect.

The script

#!/usr/bin/python3
# SPDX-License-Identifier: Apache-2.0
# Copyright (c) 2023 Taavi Väänänen <hi@taavi.wtf>
"""puppet-unmanaged - detect files that are not managed by Puppet

This script determines which files in the given directory are not
managed by Puppet. It is intended to be used to detect stale unmanaged
files when converting directories to purge mode. The script only
interacts with the local system; for a full picture, run it on all
affected hosts via Cumin.

Usage example:
  $ sudo cumin "C:ferm" "locate-unmanaged /etc/ferm/ferm.d/"
"""
import argparse
from pathlib import Path
from sys import exit

import yaml


def report_constructor(loader, node):
    return loader.construct_mapping(node)


def main() -> int:
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("directory", type=Path)
    parser.add_argument(
        "--report",
        type=Path,
        default=Path("/var/lib/puppet/state/last_run_report.yaml"),
        help="Puppet report to work on",
    )
    args = parser.parse_args()

    # parse the special report tag as a plain object
    yaml.add_constructor(
        "!ruby/object:Puppet::Transaction::Report",
        report_constructor,
        Loader=yaml.SafeLoader,
    )

    report = yaml.safe_load(args.report.read_text())

    # a set makes the membership test in the loop below O(1)
    all_managed_paths = {
        Path(resource.get("path", resource["title"]))
        for resource in report["resource_statuses"].values()
        if resource["resource_type"] == "File"
    }

    for file in sorted(args.directory.glob("**/*")):
        if file in all_managed_paths:
            continue
        print(str(file))

    return 0


if __name__ == "__main__":
    exit(main())
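
If you want to try the comparison step without a real Puppet report at hand, the following is a minimal sketch of just that part. It is not part of the script above: the `find_unmanaged` and `demo` helpers and the file names are made up for illustration, and the set of managed paths is written by hand instead of being read from `last_run_report.yaml`.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def find_unmanaged(directory: Path, managed: set[Path]) -> list[Path]:
    """Return every path below `directory` that is not in `managed`, sorted.

    Hypothetical helper: it mirrors the loop at the end of main(), but
    takes the managed paths as an argument instead of parsing a report.
    """
    return sorted(p for p in directory.glob("**/*") if p not in managed)


def demo() -> list[str]:
    # Throwaway directory standing in for something like /etc/ferm/ferm.d.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "10_managed.conf").write_text("# placed by Puppet\n")
        (root / "99_stale.conf").write_text("# left over from an old change\n")

        # Pretend the Puppet report listed only the first file.
        managed = {root / "10_managed.conf"}

        return [p.name for p in find_unmanaged(root, managed)]


if __name__ == "__main__":
    print(demo())  # prints ['99_stale.conf']
```

Only the file that is missing from the managed set is reported, and that is exactly the kind of file a later `purge => true` would delete.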


Feedback? Please email any comments to hi@taavi.wtf, or toot them at @taavi@wikis.world.