Instead of GitHub: Your own Gitea instance

Hello, everyone,

Microsoft GitHub is known to be a hotbed of controversy. Many large and well-known open source projects promoting data protection and privacy use this platform, where behavioral data of developers is collected in a central place and reused by Microsoft. This makes it possible to draw conclusions about working hours, working methods, the tools and technology used, ongoing projects, side or hobby projects and even customers, work colleagues, vacation times or travel based upon IP addresses.

It’s no coincidence that GitHub has established itself as the top dog alongside GitLab and many other smaller SCMs. The web interface is basically the gold standard, and the many tools for bug tracking, wiki, and time tracking are indispensable. But there is another way, which I would like to show you in this tutorial and hopefully inspire you to try: use your own Gitea instance for your own projects!

It comes with all of the tools listed above (a feature overview and comparison matrix can be found here). Especially worth mentioning is the git-mirroring function for frequently used repos. This allows me to work with other instances in a privacy-savvy way without leaving my behavioral data everywhere.

Interestingly, Gitea is still hosted on GitHub, but they are about to take the final step in this matter. Recently, developers from China have been blocked from accessing GitHub. This is one of the reasons why it’s better not to depend on any company or central platform and to run everything under your own control, no matter whether you run it in-house or on a rented server in a data center of your choice.

Second interesting thing: Gitea already “knows” Xojo and has matching .gitignore exemptions when you create a new repo based on the default template.

As always, I make no claim to general validity or correctness and will put the finishing touches on the project in the course of the coming days and weeks. If you find anything worthy of improvement, please feel free to comment or add here. Have fun reading!

Initial situation:

I can rely on an existing infrastructure and internet connection, which will not be discussed in detail here. A quick overview, from the outside to the inside:

  • 250 MBit Internet Business Connection with fixed IPv4
  • Fritzbox ISP modem/router in bridge mode
  • Own ipfire firewall with IPS/IDS, GeoIP blocking and VPN
  • Port forwarding to dedicated Apache reverse proxy
  • Gitea in a jail on FreeNAS 11.3 (FreeBSD)

The next steps describe how I installed and configured Gitea within a jail on FreeNAS.

(Due to forum restrictions (only 10000 words allowed) I need to split this text into multiple parts)

Installation

In the web interface of FreeNAS the gitea instance is quickly set up under Plugins.

The current gitea 1.11.4 version is used. Basically it’s just a matter of specifying the name and network address. I left them on DHCP because I do a fixed IP assignment after the installation on my ipfire.

After a few minutes you can open the instance in your web browser via its local IP on port 3000 under /install. From there you do the initial configuration and establish the connection to the pre-installed PostgreSQL database. The required user and password can be found in the FreeNAS summary sheet, which is displayed after the installation. This completes the initial setup, but the installation needs further adjustment.

Afterwards, register and log in to the Gitea web interface with the first user. This first user automatically becomes an administrator. After that the /install page disappears and returns a 404.

In the user profile under settings, all desired settings can be made at your convenience. My first point of interest is the 2-factor authentication with TOTP…

… as well as storing public GPG keys and app tokens. These are needed later when committing or pulling.

Additionally I create two organizations: an internal one visible only to me and registered users, and a public one visible to everyone.

The default values of the instance are not yet secure or privacy friendly. To prevent anyone from registering themselves, creating their own organizations and repos, or messing things up in mine, the next step is to edit the configuration file.

The gitea configuration file

This can be done either in the FreeNAS web console under “Jails”, sub-option “Shell”

or, better, directly via SSH: first onto the FreeNAS host, and from there with

# iocage console gitea

straight into the jail. Since I don’t like vi very much (and never will), my first official act is to install Midnight Commander:

# pkg install mc

That way the gitea configuration file can be edited much better with mcedit:

# mcedit /usr/local/etc/gitea/conf/app.ini

An overview of the available settings can be found at https://gitea.io/en-us/config-cheat-sheet/. In my case I have made the following adjustments (replace placeholders with your own data, of course):

[code][mailer]
ENABLED = true
HOST = SMTPHOST_PLACEHOLDER:25
FROM = EMAIL_PLACEHOLDER
USER = SMTPLOGIN_PLACEHOLDER
PASSWD = SMTPPASSWORD_PLACEHOLDER

[service]
REGISTER_EMAIL_CONFIRM = true
ENABLE_NOTIFY_MAIL = true
ALLOW_ONLY_EXTERNAL_REGISTRATION = false
ENABLE_CAPTCHA = false
DEFAULT_KEEP_EMAIL_PRIVATE = true
DEFAULT_ALLOW_CREATE_ORGANIZATION = false
NO_REPLY_ADDRESS = EMAIL_PLACEHOLDER

[security]
INSTALL_LOCK = true

[picture]
DISABLE_GRAVATAR = true

[other]
SHOW_FOOTER_BRANDING = false
SHOW_FOOTER_VERSION = false
SHOW_FOOTER_TEMPLATE_LOAD_TIME = false[/code]
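Key names in app.ini are easy to mistype, and a misspelled key can easily go unnoticed. As a rough sketch (not part of Gitea), the following Python script compares the keys in a config text against an allow-list. The allow-list, the function name `unknown_keys` and the dummy section `__top__` are my own assumptions for illustration; extend the list from the config cheat sheet.

```python
import configparser

# Keys used in this tutorial, grouped by section. This allow-list is only
# an illustration (assumption); extend it from the config cheat sheet.
KNOWN_KEYS = {
    "mailer": {"ENABLED", "HOST", "FROM", "USER", "PASSWD"},
    "service": {
        "REGISTER_EMAIL_CONFIRM",
        "ENABLE_NOTIFY_MAIL",
        "ALLOW_ONLY_EXTERNAL_REGISTRATION",
        "ENABLE_CAPTCHA",
        "DEFAULT_KEEP_EMAIL_PRIVATE",
        "DEFAULT_ALLOW_CREATE_ORGANIZATION",
        "NO_REPLY_ADDRESS",
    },
    "security": {"INSTALL_LOCK"},
    "picture": {"DISABLE_GRAVATAR"},
    "other": {
        "SHOW_FOOTER_BRANDING",
        "SHOW_FOOTER_VERSION",
        "SHOW_FOOTER_TEMPLATE_LOAD_TIME",
    },
}

# Hypothetical dummy section: app.ini may contain keys above the first
# [section] header, which configparser would otherwise reject.
TOP = "__top__"


def unknown_keys(ini_text):
    """Return (section, key) pairs that are not in KNOWN_KEYS."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep key names case-sensitive
    parser.read_string(f"[{TOP}]\n{ini_text}")
    found = []
    for section in parser.sections():
        if section == TOP:
            continue
        # Sections missing from the allow-list report all of their keys.
        allowed = KNOWN_KEYS.get(section, set())
        for key in parser[section]:
            if key not in allowed:
                found.append((section, key))
    return found


sample = """
[service]
REGISTER_EMAIL_CONFIM = true
DEFAULT_KEEP_EMAIL_PRIVATE = true
"""
print(unknown_keys(sample))  # → [('service', 'REGISTER_EMAIL_CONFIM')]
```

To check the real file, read /usr/local/etc/gitea/conf/app.ini into a string and pass it to `unknown_keys`.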

Since I run the Gitea instance behind a reverse proxy, the ROOT_URL in the [server] section must be set to the public URL. This also eliminates the need to set up a certificate inside the jail, because TLS is handled entirely on the reverse proxy. If you do not have a reverse proxy yet, you are free to install the web server of your choice along with a Let’s Encrypt client such as Certbot within this jail, though I wouldn’t recommend it for security reasons.

All settings are applied when the instance is restarted:

# service gitea restart

Reverse Proxy

I’ll continue with Apache on my upstream reverse proxy. The required modules proxy, proxy_http and proxy_http2 (plus headers for the Header directives below) should be enabled with a2enmod beforehand. Here you create a new virtual host for the public server address, fetch a Let’s Encrypt certificate with Certbot and store it together with the security headers in the /etc/apache2/sites-available/name.conf file.

These headers in detail are:

    Header add Strict-Transport-Security "max-age=63072000;"
    Header set X-Content-Type-Options "nosniff"
    Header set X-Robots-Tag "none"
    Header set Referrer-Policy "no-referrer"
    Header set X-XSS-Protection "1; mode=block"
    Header always edit Set-Cookie (.*) "$1; Secure"

The crucial lines for the reverse proxy are (replace the address with your internal IP):

    ProxyPreserveHost On
    ProxyRequests off
    AllowEncodedSlashes NoDecode

    ProxyPass /.well-known !

    ProxyPass / http://192.168.101.106:3000/ nocanon
    ProxyPassReverse / http://192.168.101.106:3000/

Note that the exception for the .well-known directory must come first; otherwise Certbot’s requests are passed on to Gitea when renewing the certificate. After restarting Apache you can connect to the Gitea instance.
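To verify that the proxy really sends the headers listed above, a small check can help. This is only a sketch: the function names `missing_headers` and `check` are my own, and `check` needs network access to your public URL.

```python
from urllib.request import urlopen

# The headers configured on the reverse proxy in this tutorial.
EXPECTED_HEADERS = [
    "Strict-Transport-Security",
    "X-Content-Type-Options",
    "X-Robots-Tag",
    "Referrer-Policy",
    "X-XSS-Protection",
]


def missing_headers(headers):
    """Return the expected header names absent from a header mapping.

    Header names are compared case-insensitively.
    """
    present = {name.lower() for name in headers}
    return [name for name in EXPECTED_HEADERS if name.lower() not in present]


def check(url):
    """Fetch a URL and report which expected headers the response lacks."""
    with urlopen(url, timeout=10) as response:
        return missing_headers(dict(response.headers.items()))
```

Calling `check("https://git.example.org/")` with your own domain (the one here is a placeholder) should return an empty list once the virtual host is configured.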

I logged in immediately from the outside

and created a first repo for testing purposes.

Fail2Ban

Then follows the setup of Fail2Ban. Since I run it on the reverse proxy, I have to deviate from the documentation here and work on the Apache log files instead. This offers two advantages: I don’t need to forward an extra “X-Real-IP” HTTP header with the visitor’s external IP, and the load on my infrastructure is reduced as well.

Because I have ipfire with IDS/IPS and the GeoIP filter in front, and have also deactivated user self-registration in Gitea, there is no need to worry. If somebody really wants to be evil and DDoS me, there are still enough ways to counteract on the firewall, or with mod_ratelimit or mod_security on the reverse proxy.
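To illustrate what matching on the Apache log files can look like, here is a rough Python sketch that counts suspicious login attempts per IP. It is not a Fail2Ban filter. It assumes the Apache “combined” log format, and it assumes that a failed Gitea login appears as a POST to /user/login answered with status 200 while a successful one is answered with a redirect; please verify both against your own logs before relying on them.

```python
import re
from collections import Counter

# Start of an Apache "combined" log line: client IP, identity, user,
# timestamp, request line and status code.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)


def failed_logins(lines, threshold=2):
    """Return {ip: count} for IPs with at least `threshold` failed logins.

    Assumption: POST /user/login answered with 200 means the login form
    was shown again, i.e. the attempt failed.
    """
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue
        if (
            match["method"] == "POST"
            and match["path"].startswith("/user/login")
            and match["status"] == "200"
        ):
            hits[match["ip"]] += 1
    return {ip: count for ip, count in hits.items() if count >= threshold}


# Made-up sample lines using documentation IP ranges.
sample_log = [
    '203.0.113.7 - - [12/Apr/2020:10:00:01 +0200] "POST /user/login HTTP/1.1" 200 9876 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [12/Apr/2020:10:00:03 +0200] "POST /user/login HTTP/1.1" 200 9876 "-" "Mozilla/5.0"',
    '198.51.100.4 - - [12/Apr/2020:10:01:00 +0200] "POST /user/login HTTP/1.1" 302 0 "-" "Mozilla/5.0"',
]
print(failed_logins(sample_log))
```

A Fail2Ban filter would express the same idea as a failregex on the access log of the virtual host.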

Further adjustments

What follows in the next few days is the customization of the start page, the legal information, the creation of a CSP on the reverse proxy and the monitoring. The Gitea webhooks are particularly interesting, and I plan to integrate them with the Data-Analyzer app in my Nextcloud. If you are using git or other tools like git-tower, you can connect and clone as you would with GitHub, GitLab or anything else. Of course, user registrations and commits are only for handpicked persons, and those need to enable 2FA and app tokens as well.

GPG-signed commits lift security and transparency to a new level:

If some of you wonder why you can’t access my public repos and instance at all: as mentioned, I use GeoIP filters. I have basically blocked everything outside the EU, because a small site and business like mine has no need to be accessible from everywhere in the world.

Everybody else may try to clone the Rock-Paper-Scissor-Lizard-Spock repo I once wrote for fun for a forum thread discussion :wink:

https://git.jakobssystems.de/tomas.jakobs/rock-paper-scissor-lizard-spock

to be continued…

Good morning,

how can you use GitHub with minimal impact on privacy? By working with a mirror in Gitea.

The only behavioral data that GitHub receives comes from my server, which synchronizes everything at definable intervals (e.g. every 24h) at a fixed time. So working habits, work and vacation times, the software I use, the IP I work from, whom I communicate with: all this information stays private.

Here’s an example from today. Just mirrored an Internet Draft Document:

All issues, the wiki and of course files, branches, tags etc. are kept in sync with the origin.
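I set up my mirrors in the web interface, but the same should be possible through the Gitea REST API. The following Python sketch only builds the request and does not send it. The endpoint `/api/v1/repos/migrate` and the field names reflect my understanding of the API and should be checked against the Swagger page of your own instance; the function name, domain, token and repository are placeholders.

```python
import json
from urllib.request import Request


def build_mirror_request(instance_url, token, clone_addr, repo_name):
    """Build (but do not send) a request that creates a mirror repo.

    Assumption: field names follow the migrate endpoint as documented in
    the instance's Swagger UI. Older releases may additionally require a
    numeric `uid` for the owner.
    """
    payload = {
        "clone_addr": clone_addr,  # upstream repository to mirror
        "repo_name": repo_name,    # name of the local copy
        "mirror": True,            # keep syncing instead of a one-off import
        "private": True,
    }
    return Request(
        f"{instance_url.rstrip('/')}/api/v1/repos/migrate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_mirror_request(
    "https://git.example.org",              # placeholder instance
    "TOKEN_PLACEHOLDER",                    # app token from the user settings
    "https://github.com/example/project.git",
    "project",
)
print(request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Since only my server talks to GitHub here, the privacy argument stays the same whether the mirror is created by hand or by script.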

This past week I found time to continue working on my instance. I lifted the hard restrictions on third-party registrations and now allow anyone to register, of course only after turning on email confirmation with REGISTER_EMAIL_CONFIRM in the config file. However, I have prevented users from creating their own repos with MAX_CREATION_LIMIT, and the public view of repos and activities is limited to logged-in users with REQUIRE_SIGNIN_VIEW.

The HTML template for the footer was completed with an imprint, and unneeded information was removed. An important privacy-related switch is OFFLINE_MODE, which makes sure all resources come from my own server and not from third parties. Of course, things like Gravatar, OpenID, captchas etc. are all turned off. This is intentional, by default and by design. Here is an overview of my manually adjusted settings; the usual defaults are not shown:

[repository]
MAX_CREATION_LIMIT = 0

[repository.pull-request]
WORK_IN_PROGRESS_PREFIXES = WIP:,[WIP]

[server]
OFFLINE_MODE = true
DISABLE_SSH = true

[service]
ENABLE_CAPTCHA = false
REQUIRE_SIGNIN_VIEW = true
ALLOW_ONLY_EXTERNAL_REGISTRATION = false
REGISTER_EMAIL_CONFIRM = true
DEFAULT_KEEP_EMAIL_PRIVATE = true
DEFAULT_ALLOW_CREATE_ORGANIZATION = false

[picture]
DISABLE_GRAVATAR = true
ENABLE_FEDERATED_AVATAR = false

[openid]
ENABLE_OPENID_SIGNUP = false
ENABLE_OPENID_AVATAR = false

[mailer]
SEND_AS_PLAIN_TEXT = true

I’ve also implemented a CSP on my reverse proxy. Replace the placeholder with your own domain name. I am using Apache, so nginx or Caddy users need to adapt the syntax accordingly:

Header add Content-Security-Policy "default-src 'self' 'unsafe-eval' 'unsafe-inline' data: 'self' *.domain.de; worker-src 'self' *.domain.de; frame-ancestors 'self' *.domain.de; img-src 'self' data: 'self' *.domain.de; object-src 'self'; style-src 'self' 'unsafe-inline' *.domain.de"

Why such strict CSP settings, you might ask? To prevent third-party resources from being loaded. Mainly, I noticed markup texts and wiki documents where badges could be misused as trackers:

I am well aware that such badges are meant to communicate a commitment and/or even a counter for something. Call me overcautious, but this happens completely without any legal basis or consent. I do not want to let somebody else know where, when, on which device or in which editor I read a simple markup text file.

The worst thing about it: unlike a website with an imprint and privacy notice, nobody knows in what context a document is read. It could be within an IDE in a locally cloned directory, and with SVG you are dealing with potentially scriptable content. That is something I don’t want to see on a development machine.

On the next screenshot you can see the effect of running your own instance in combination with a strict CSP: on the left-hand side is all the linked media on my instance, on the right-hand side all the linked media in the original repository. Gitea rewrites all local links, and the CSP does the rest, preventing any badges from phoning home.
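To make the effect of the img-src directive a bit more tangible, here is a heavily simplified Python model of how a browser decides whether an image may be loaded. It only looks at host names and the data: scheme and ignores schemes, ports and paths, so treat it as an illustration, not as a CSP implementation. The function names and the host git.domain.de are assumptions.

```python
from fnmatch import fnmatch
from urllib.parse import urlparse


def parse_csp(policy):
    """Split a CSP header value into {directive: [sources]}."""
    directives = {}
    for part in policy.split(";"):
        tokens = part.split()
        if tokens:
            directives[tokens[0]] = tokens[1:]
    return directives


def image_allowed(policy, image_url, page_host):
    """Simplified check whether img-src (or default-src) permits a URL."""
    directives = parse_csp(policy)
    sources = directives.get("img-src", directives.get("default-src", []))
    parsed = urlparse(image_url)
    if parsed.scheme == "data":
        return "data:" in sources
    host = parsed.hostname or ""
    for source in sources:
        if source == "'self'":
            if host == page_host:
                return True
        elif not source.startswith("'") and not source.endswith(":"):
            # Host source such as *.domain.de
            if fnmatch(host, source):
                return True
    return False


POLICY = "default-src 'self'; img-src 'self' data: *.domain.de"

# A badge from a third-party host is refused ...
print(image_allowed(POLICY, "https://img.shields.io/badge/x.svg", "git.domain.de"))
# ... while media served by the own instance is fine.
print(image_allowed(POLICY, "https://git.domain.de/avatars/1", "git.domain.de"))
```

The first call returns False and the second True, which is the behavior visible in the screenshot: local media loads, external badges do not.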

Privacy has been greatly improved and none of my behavioral data is sent to Microsoft/GitHub anymore. The only thing they see is my office server, which checks the mirrored repos for changes in 24h intervals.

I recommend this to every security- and privacy-aware developer. For me, it is a selling point with my customers, some of whom appreciate security and confidentiality when working together.

Extend Gitea with Mermaid

Something that has been on my to-do list for quite some time is the integration of Mermaid into my Gitea instance. For those who don’t know Mermaid: it is a Markdown extension for quickly drawing UML charts and even Gantt charts. Why Mermaid? Because it runs completely client-side without additional services. The syntax is quite simple and powerful as well. Once implemented, it looks like this:

No third parties

Unfortunately, the documented default is anything but privacy friendly. The Mermaid packages are supposed to be loaded from the third-party provider unpkg, but I have excluded all third parties with my CSP.

For this reason you have to host the files yourself. The most difficult part is finding the required ones. Unfortunately this is not obvious from the documentation, and the Mermaid GitHub repo is not very helpful either. So I downloaded them manually from https://unpkg.com/browse/mermaid@8.5.1/dist/ and picked only the ones that are really needed:

  • mermaid.min.js
  • mermaid.min.js.map

Adjustment Reverse Proxy

Since I would like to keep my Gitea clean and since these files are static JavaScript, it makes sense to keep them in a subdirectory on the upstream reverse proxy. This is done with the line

ProxyPass /myplugin/ !

in the sites-available config file of the virtual host in Apache. After an Apache restart, Mermaid is ready to use. Whether the directory is called /myplugin or something else is up to you. It just should not collide with the existing Gitea paths.

Adjustment Gitea Footer template

Last but not least you have to customize the custom/footer.tmpl template in Gitea. Just add these lines there:

[code]{{if .RequireHighlightJS}}

{{end}}[/code]

Use your own domain and the directory you defined before. After this change, restart the Gitea service as well, and you will enjoy the beautifully visualized UML charts, diagrams & Co.

There is one detail to be aware of when writing Markdown documents. Of course I fell into this trap myself and wondered for an hour why Mermaid didn’t work for me. The preview function of the editor shows only the plain Markdown. The diagram is rendered by Mermaid only after the document has been saved.

Here is an example from the screenshot before. You will find more in the Mermaid documentation.

graph LR;
    A[Box with square edges] -->|a link text| B(Box with round edges)
    B --> C{a decision}
    C -->|One| D[Result One]
    C -->|Two| E[Result Two]

Enjoy!