10.30.2017

FNBR Performance Update


Performance matters, and while we’ve been steadily chipping away at improving things, we’ve taken a few steps back here and there and generally aren’t where we want to be long term.

From a player’s point of view, performance-related issues manifest as:

Client
  • Low framerate -> increased input latency, large delta between frames
  • Display latency -> increased input latency
  • Hitches (abnormally long frame time) -> frozen display followed by fast-forward
  • Inconsistent framerate -> hard to aim

Server
  • Low player connection update rate / framerate -> increased ping
  • Hitches (abnormally long frame time) -> “rubber banding”

Network
  • Packet loss -> “rubber banding”
  • Connection bandwidth saturation -> “rubber banding”
  • High ping -> “rubber banding”
  • Packet bursting (especially on wifi) -> “rubber banding”

Here, “rubber banding” means inconsistencies between client and server that result in a jarring correction. Examples include jerky movement of other players, door interactions feeling weird, etc.

There are a ton of different issues resulting in rubber banding, and even more root causes. For example, our servers might hitch because of issues in our code, because other servers running on the same machine have a negative impact (noisy neighbor), or because issues with the host OS or HW degrade performance without causing outright failures (gray failures).

Your connection to our datacenters is first and foremost impacted by our choice of location for datacenters and your ping to them, but there is also a lot of variation in ping based on the route your packets take. There is also potential for additional latency and packet loss if you are using wifi.


Here is what we are doing or need to do on the monitoring side:
  • Track client performance for various classes of GPU HW segmented by resolution and graphics settings so we understand the impact of our changes.
  • Track client hitches with some weighting based on impact on player experience.
  • On the client, track network packets received from the server (and the time between them), and do the same on the server for packets from the client.
  • Track server performance and number of missed frames -- as a first goal, we need to run at a solid 20 Hz.
  • Track aggregate server performance per VM to identify bad machines and cycle them out. 
  • Track ping and packet loss per region and ISP so we can inform and escalate when we detect anomalies.
  • Track situations resulting in wells of despair. For example, constrained bandwidth plus packet loss can cause us to send more traffic, which in turn makes things worse.
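
As a rough illustration of the server frame tracking mentioned above, here is a minimal Python sketch that counts missed frames against a 20 Hz target, which works out to a 50 ms budget per frame. The names `FrameStats` and `summarize_server_frames` are hypothetical and only serve the example; this is not how our servers actually record the data.

```python
from dataclasses import dataclass

TARGET_HZ = 20                         # goal stated above: a solid 20 Hz
FRAME_BUDGET_MS = 1000.0 / TARGET_HZ   # 50 ms available per server frame


@dataclass
class FrameStats:
    """Hypothetical summary of one server's frame times."""
    total_frames: int
    missed_frames: int

    @property
    def missed_ratio(self) -> float:
        # Fraction of frames that ran over budget (0.0 if there is no data).
        if self.total_frames == 0:
            return 0.0
        return self.missed_frames / self.total_frames


def summarize_server_frames(frame_times_ms, budget_ms=FRAME_BUDGET_MS):
    """Count frames whose duration exceeded the per-frame budget."""
    times = list(frame_times_ms)
    missed = sum(1 for t in times if t > budget_ms)
    return FrameStats(total_frames=len(times), missed_frames=missed)
```

Aggregating the resulting ratio per VM is one way the "bad machine" check could be approached, under the assumption that frame times are logged per server process.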


Here is what we need to do on the development side:
  • Improve performance on min spec PC systems (Nvidia GTX 460, Radeon HD 5570, Intel HD 4000). This is an area where we made things worse recently, but took initial steps to correct in v1.8. We’re not going to stop there.
  • Fix GPU hangs on PC and continue work with graphics card vendors such as NVIDIA, AMD, and Intel on improving performance and stability.
  • Improve input latency on consoles. Improvements shipped with v1.8 -- please let us know your thoughts.
  • Continue our push on improving console performance. We track the percentage of missed VSYNCs and want to be at less than 2% of frames (barely) missing it.
  • Reduce hitches during gameplay. We define a hitch as a frame that took more than 60 ms, causing an entire frame to be skipped. The goal here is to get to less than one per minute, with a focus on entirely eliminating hitches over 100 ms.
  • Fix remaining hitches on dedicated servers. E.g. a lot of players jumping late can result in rubber banding for players early on.
  • Optimize server performance of common actions like taking damage.
  • Identify the source of hitches that occur only in the first hour after an update is released.
  • Optimize our server and network code to allow sending of player state to all 100 connections per frame. Right now we are updating 25 connections per frame in the lobby and 50 during the game. That means your play experience isn’t where we want it to be till there are 50 players left. This is a major change that is running in parallel with other optimizations.
  • Improve our handling of edge cases that can result in wells of despair.
  • Improve our matchmaking system to dynamically route traffic to data centers within a region based on location. The goal is to optimize for ping without taking away the ability to play with friends.
  • Hire more smart people who are passionate about these sorts of technical challenges.
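
To make the hitch goal above concrete, here is a small Python sketch that applies the 60 ms and 100 ms thresholds to a list of frame times and checks the "less than one per minute" target. It assumes elapsed play time can be approximated by the sum of the frame times, and `hitch_report` is a hypothetical helper, not part of the game code.

```python
HITCH_MS = 60.0          # a frame longer than this counts as a hitch
SEVERE_HITCH_MS = 100.0  # hitches above this should be eliminated entirely


def hitch_report(frame_times_ms):
    """Summarize hitches for a capture of client frame times (in ms)."""
    times = list(frame_times_ms)
    # Assumption: total elapsed time is the sum of all frame times.
    total_minutes = sum(times) / 60000.0
    hitches = sum(1 for t in times if t > HITCH_MS)
    severe = sum(1 for t in times if t > SEVERE_HITCH_MS)
    per_minute = hitches / total_minutes if total_minutes > 0 else 0.0
    return {
        "hitches": hitches,
        "severe_hitches": severe,
        "hitches_per_minute": per_minute,
        # Goal: fewer than one hitch per minute and none over 100 ms.
        "meets_goal": per_minute < 1.0 and severe == 0,
    }
```

For example, one minute of 20 ms frames containing a single 80 ms frame and a single 120 ms frame would report two hitches per minute and miss the goal.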


Here are the things you can check and potentially address on your end:

Check whether your graphics card drivers are up to date, and if not, upgrade to the latest version.

We’re currently running our servers in AWS datacenters around the world and you can visit http://www.cloudping.info/ for a quick check on your ping to them.
  • NA - Virginia, Ohio 
  • EU - Frankfurt, London
  • OCE - Sydney
  • BR - Sao Paulo
  • ASIA - Tokyo
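
If you prefer a script over a web page, the following Python sketch estimates round-trip latency by timing a TCP connection. It is only a rough proxy for ping. The hostnames are an assumption: they are public AWS regional endpoints for the locations listed above, not Fortnite game servers, and `tcp_connect_ms` and `median_connect_ms` are hypothetical helper names.

```python
import socket
import statistics
import time

# Assumed mapping from the regions listed above to public AWS regional
# endpoints. These are illustrative targets, not Fortnite game servers.
ASSUMED_ENDPOINTS = {
    "NA (Virginia)": "ec2.us-east-1.amazonaws.com",
    "NA (Ohio)": "ec2.us-east-2.amazonaws.com",
    "EU (Frankfurt)": "ec2.eu-central-1.amazonaws.com",
    "EU (London)": "ec2.eu-west-2.amazonaws.com",
    "OCE (Sydney)": "ec2.ap-southeast-2.amazonaws.com",
    "BR (Sao Paulo)": "ec2.sa-east-1.amazonaws.com",
    "ASIA (Tokyo)": "ec2.ap-northeast-1.amazonaws.com",
}


def tcp_connect_ms(host, port=443, timeout=2.0):
    """Time a single TCP connection setup to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; close it right away
    return (time.perf_counter() - start) * 1000.0


def median_connect_ms(host, port=443, samples=5, timeout=2.0):
    """Take several samples and return the median to smooth out spikes."""
    return statistics.median(
        tcp_connect_ms(host, port, timeout) for _ in range(samples)
    )
```

Calling `median_connect_ms(ASSUMED_ENDPOINTS["EU (London)"])` would give a rough figure for that region. The first sample also includes a DNS lookup, so treat the result as an upper bound rather than an exact ping.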

Riot Games is hosting www.lagreport.com which is a great source of general information surrounding ISPs. Applicability for Fortnite may vary.

If you’re using wifi and have the ability to try a wired connection, it might be worth a shot to improve latency, reduce packet loss, and also reduce packet bursting.

Your TV settings can impact latency as well and there are various guides available online on how to tweak your settings. Here is an example.

On PC you can use articles like this one to check whether you have any unexpected background processes that drain resources.