Electric load forecasting involves gathering and analyzing load data from end-users to help utilities balance supply and demand, plan and manage their infrastructure, and price their services. Sharing electric load data, however, raises privacy concerns. Federated Learning (FL) addresses this by training a global model across several clients that share model weights instead of data. Using multiple server aggregation algorithms, we implement several fine-tuning (personalization) optimizers to improve local load forecasting accuracy compared to non-personalized FL. We also propose training a Temporal Fusion Transformer (TFT), a deep learning model that combines the strengths of transformers and LSTM networks, via FL; this improved load forecasting accuracy and reduced communication and computation costs compared to other common deep learning models. We present recommendations on several TFT architecture hyperparameters to improve load forecasting accuracy and reduce costs even further. The enforcement and relaxation of lockdown rules and people's changing habits caused major uncertainty in load forecasting during the coronavirus disease 2019 (COVID-19) pandemic. To overcome this, we propose two disaster-aware multi-task learning (MTL) models for residential-level and city-level load forecasting. The proposed MTL models enabled learning from two different datasets with different features and reduced the load forecasting error at both levels compared to a non-MTL approach.
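To make the weight-sharing idea concrete, the following is a minimal sketch of one server aggregation step. It assumes FedAvg-style averaging weighted by each client's number of local samples; the abstract does not name the aggregation algorithms actually used, so this choice, the function name `federated_average`, and the dictionary-of-lists weight layout are all illustrative assumptions rather than the authors' implementation.

```python
from typing import Dict, List

# Hypothetical weight layout: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Aggregate client model weights into global weights.

    Assumes FedAvg-style averaging: each client's weights are scaled by
    its share of the total number of local training samples. Only the
    weights and sample counts reach the server, never the load data.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    aggregated: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        aggregated[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return aggregated
```

In a personalized setting, each client would then start from the returned global weights and fine-tune them on its own load data for a few local steps; that fine-tuning stage is not shown here.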