#873955 RFP: selfspy -- log everything you do on the computer, for statistics/fun etc.

#873955#5
Date:
2017-09-01 14:29:06 UTC
From:
To:
* Package name    : selfspy
* URL             : https://github.com/gurgeh/selfspy
  Upstream Author : David Fendrich (@gurgeh)
* License         : GPLv3

  Selfspy continuously monitors and stores what you are doing on
  your computer. This way, you can get all sorts of nifty statistics
  and reminders on what you have been up to. It is inspired by the
  Quantified Self-movement and Stephen Wolfram's personal key
  logging [0]

  [0] http://blog.stephenwolfram.com/2012/03/the-personal-analytics-of-my-life/


(Similar to arbtt, but seemingly based on keystrokes instead of window
titles, etc.)


Regards,

#873955#10
Date:
2017-09-05 22:46:52 UTC
From:
To:
May I suggest we audit the hell out of the code of tools like this
before they come into Debian? :)

This is basically a spyware tool, with huge privacy implications.

For example, from a simple usability perspective, we may want to make
the "-n" option the default (ie. don't log actual keystrokes, just
timing).

Of course, we should also make sure the data stays on disk... (as
opposed to being sent to a third-party server)

A.

#873955#15
Date:
2017-09-06 06:55:53 UTC
From:
To:
Hi Antoine,

Oh sure. And ones already in Debian I hope!


Regards,

#873955#20
Date:
2017-09-07 14:07:04 UTC
From:
To:
Control: owner -1 anarcat@debian.org
Control: retitle -1 ITP: selfspy -- log everything you do on the computer, for statistics/fun etc.

I'll start with that one since that's the one I'm currently interested
in if you don't mind.

At first look, the program looks well-written. It uses sqlalchemy for
storage in a SQLite database and while it doesn't have any unit tests,
it looks like it has a sound design. I didn't review the cocoa or
Windows code (specifically `sniff_cocoa.py` and `sniff_win.py`), so
the following applies only to the X version.

Entries can be encrypted in the database. The program uses Python's
getpass module to prompt for passwords on the commandline and Tk's
`Entry` element in the GUI, which probably leaks the password all over
the memory space (and may show it to the user). The pinentry program
should be used instead.
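
A rough sketch of what prompting through pinentry could look like, as an
illustration only and not selfspy's code. It assumes a graphical
`pinentry` binary is on the PATH and speaks the usual Assuan line
protocol (`SETPROMPT`, `GETPIN`, `D`/`OK`/`ERR` replies). The function
names `ask_passphrase` and `parse_getpin_reply` are made up for this
example.

```python
import subprocess
from urllib.parse import quote, unquote


def parse_getpin_reply(lines):
    """Extract the passphrase from pinentry's reply to GETPIN.

    Replies are line based: "D <data>" carries percent-escaped data,
    "OK" ends a successful reply and "ERR ..." reports a failure such
    as the user cancelling the dialog.
    """
    pin = None
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("D "):
            pin = unquote(line[2:])
        elif line.startswith("OK"):
            return pin  # None means no data line was received
        elif line.startswith("ERR"):
            raise RuntimeError("pinentry failed: " + line)
    raise RuntimeError("pinentry closed the connection unexpectedly")


def ask_passphrase(prompt="Passphrase:", program="pinentry"):
    """Ask for a passphrase through an external pinentry process."""
    proc = subprocess.Popen([program], stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True)

    def send(command):
        proc.stdin.write(command + "\n")
        proc.stdin.flush()

    try:
        proc.stdout.readline()  # greeting line sent by pinentry on startup
        send("SETPROMPT " + quote(prompt, safe=" :"))
        proc.stdout.readline()  # expected to be "OK"
        send("GETPIN")
        return parse_getpin_reply(iter(proc.stdout.readline, ""))
    finally:
        proc.stdin.close()  # end of input makes pinentry exit
        proc.wait()
```

A terminal-only pinentry would probably also need to be told which tty
to use, which this sketch leaves out.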

But that password is problematic in the first place since you need to
either type it every time you start the program (which is annoying) or
hardcode it on the commandline (which is insecure) or the config file
(which is pointless). A better approach would be to use OpenPGP to
encrypt the database (filed an [issue][openpgp] about this).

 [openpgp]: https://github.com/gurgeh/selfspy/issues/155

Furthermore, Blowfish is used to encrypt data stored in the database.
Because of its 64-bit block size, Blowfish is considered problematic for
large file encryption as it's [vulnerable to birthday attacks][blowfish].
Furthermore, MD5 is used to
derive a key from the user password, which also shows its age. A
better approach would be to use a more standard SQLite encryption
approach like [SQLcipher][]. An ideal design, in my mind, would be to
have an AES-encrypted SQLite database with a strong key encrypted with
a user password using a proper key derivation function (KDF) like
Argon2, scrypt or PBKDF2 (probably in that order). Also filed an
[issue][] about this.

 [blowfish]: https://en.wikipedia.org/wiki/Blowfish_(cipher)#Weakness_and_successors
 [SQLcipher]: https://www.zetetic.net/sqlcipher/
 [issue]: https://github.com/gurgeh/selfspy/issues/159
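
To illustrate the key derivation step only, here is a minimal sketch
using PBKDF2 from Python's standard library. PBKDF2 is picked here
simply because it ships with `hashlib`, not because it is the preferred
choice. The function names, the salt size and the iteration count are
assumptions for the example, and nothing here is selfspy's actual code.

```python
import hashlib
import os

SALT_BYTES = 16  # assumed salt size for this sketch
KEY_BYTES = 32   # long enough for an AES-256 key


def new_salt():
    """Return a fresh random salt to store next to the database."""
    return os.urandom(SALT_BYTES)


def derive_key(password, salt, iterations=600_000):
    """Derive a key from a password with PBKDF2-HMAC-SHA256.

    The iteration count is an assumed, tunable work factor: higher
    values slow down both legitimate use and brute-force guessing.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=KEY_BYTES
    )
```

The derived key would then be used to encrypt the random database key,
which needs an AES implementation from outside the standard library and
is left out here.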

So that's for the security of the encrypted storage: basically, it's
showing its age, but should be good enough for casual attackers. As I
mentioned before, a better option would be to *not* store keystroke
text by default and explicitly force the user to enable this by hand
if they really want a keylogger running at all times. I have also made
a [PR][] for that feature.

 [PR]: https://github.com/gurgeh/selfspy/pull/158
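
The opt-in idea can be sketched roughly as follows. This is not the
code from the PR: the `--store-text` flag and the `record_keystroke`
helper are hypothetical names invented for the example.

```python
import argparse


def build_parser():
    """Build a command-line parser where storing key text is opt-in."""
    parser = argparse.ArgumentParser(
        description="hypothetical sketch, not selfspy's real command line"
    )
    parser.add_argument(
        "--store-text",
        action="store_true",
        default=False,
        help="also store the text of keystrokes (off unless requested)",
    )
    return parser


def record_keystroke(key, timestamp, store_text):
    """Return the row to store: timing always, key text only on opt-in."""
    return {"time": timestamp, "key": key if store_text else None}
```

With no arguments, `build_parser().parse_args([]).store_text` is
`False`, so only timing would be kept.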

The keylogger itself uses what looks like [Python's Xlib][python-xlib] and
relies on the [RECORD][] extension which will stream all keystrokes
for all clients to the application, regardless of whether or not
keylogging is enabled. This will force administrators to enable that
possible security liability on the X server in order for users to use
this app, but that's a fundamental limitation of X more than an issue
with this particular app.

 [python-xlib]: https://pypi.python.org/pypi/python-xlib
 [RECORD]: https://www.x.org/releases/X11R7.6/doc/recordproto/record.html

Finally, while I cannot vouch for the software without a more thorough
review, I can say that I have read through the code and didn't find
any obvious "leakage" out of the SQLite storage. It doesn't look like
the program sends keystrokes over the network or writes them to
publicly readable files.

As for Debian packaging, that should be fairly straightforward: all
dependencies (lockfile, sqlalchemy, keyring, xlib, although the latter
is [missing from requirements][]) are already in Debian and it's a
fairly normal Python program. I have filed a PR
to [remove the Makefile][] to avoid possible debhelper confusion.

I'll be using the program a little more to figure out if there are any
other gotchas and upload if all is well.

 [remove the Makefile]: https://github.com/gurgeh/selfspy/pull/156
 [missing from requirements]: https://github.com/gurgeh/selfspy/pull/157

Thanks for finding that awesome software. :)

A.

#873955#29
Date:
2017-09-07 15:20:34 UTC
From:
To:
Hi Antoine,

Thanks for this, as well as for your initial review of the package.


Best wishes,

#873955#34
Date:
2017-09-29 09:11:33 UTC
From:
To:
Hi Anarcat

Any update on this? :)


Regards,

#873955#39
Date:
2017-09-29 12:49:34 UTC
From:
To:
No progress at all. I've been using it for a while, was thinking of
running py2deb or whatever that thing is called they demo'd at debconf,
and just be done with it.

But if you're itching to package this, please do it, be my guest. :)

A.

#873955#44
Date:
2017-09-29 13:59:54 UTC
From:
To:
Dear Antoine,

If it's "Just Python", then if you do the initial stuff under the
Debian Python Team I'll be happy to keep an eye on it as well. :)


Best wishes,

#873955#53
Date:
2022-06-18 17:12:33 UTC
From:
To:
I'm kind of abusing this RFP here, but what the heck, I don't know where
else to put this.

I've been reluctant to start using selfspy because of the obvious
privacy and security implications of constantly running a keylogger on
my computer. So I'm looking at alternatives.

There's two things I want this thing to do:

 1. monitor the *number* and *amount* of keystrokes and mouse movement,
    to give an idea of how badly I'm working (too much)

 2. figure out *what* I'm working on, for time tracking purposes like
    billing

Workrave is what I currently use for the former, and it kind of works,
but it's really hard to pull the data out of there. It does nothing for
the latter.

Selfspy does both, but at a tremendous cost: it keeps all keystrokes in
a database! Ouch. That's not a requirement *I* have.

So timetrack is the other option I found:

https://github.com/joshmcguigan/timetrack

It keeps track of which *files* you're working on, using filesystem
monitoring tools. I am not sure it will work for things like "I am
writing to the BTS now" or "I'm wasting time on Hacker News" because
files are not involved there, but I found it was an interesting enough
approach that it was worth mentioning.

But once you step into timetracking territory, there's a *lot* of tools
that do things like that out there...

https://mjasnik.gitlab.io/timekpr-next/ (in Debian)
https://github.com/tagtime/TagTime
https://github.com/kimmobrunfeldt/git-hours
https://timesnapper.com/
https://hackage.haskell.org/package/arbtt (in Debian)
https://activitywatch.net/ (#990173)

The last two are especially interesting, I find. Activity Watch, in
particular, watches windows with all sorts of hooks in web browsers or
editors to figure out what's going on in there. It doesn't seem to keep
track of keystrokes, but maybe that's better left to a tiny little tool
instead.

In fact, I wonder if the kernel's input devices don't already have
counters for that kind of stuff we can use. Surely the kernel counts
keystrokes, right? :)
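
Whether the kernel keeps such counters is left open here, but a tiny
tool could count key presses itself from the raw event stream of a
Linux input device without remembering which keys were pressed. The
sketch below assumes the classic `struct input_event` layout (two
native longs for the timestamp, then type, code and value) and a
hypothetical helper name, `count_key_presses`.

```python
import struct

# Assumed layout of struct input_event: timeval (two native longs),
# then 16-bit type, 16-bit code and 32-bit signed value.
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)

EV_KEY = 0x01     # key and button events
KEY_PRESS = 1     # value 1 is a press, 0 a release, 2 an autorepeat
BTN_MISC = 0x100  # codes from here on are buttons, not keyboard keys


def count_key_presses(raw):
    """Count keyboard key presses in a buffer of raw input events.

    Key codes are only compared against BTN_MISC and then discarded,
    so nothing about what was typed is kept.
    """
    count = 0
    for offset in range(0, len(raw) - EVENT_SIZE + 1, EVENT_SIZE):
        _sec, _usec, ev_type, code, value = struct.unpack_from(
            EVENT_FORMAT, raw, offset
        )
        if ev_type == EV_KEY and value == KEY_PRESS and code < BTN_MISC:
            count += 1
    return count
```

The buffer would come from reading a device node under `/dev/input/`
in binary mode, which normally requires root or membership in a
privileged group.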

a.