Lighthouse performs most of its services in a way that avoids business interruption,
much as a large data center network with a full-time staff would. Most of this is
accomplished through our remote monitoring, proprietary alert system, and remote offsite
work, which mostly happens silently during off hours and goes unnoticed day to day. We
often make adjustments and repairs before anyone notices that anything is going wrong.
As we perform repairs on various problems, we also do maintenance on each device as we
encounter it. For those devices that we haven't seen recently, onsite maintenance is
our chance to get a visual and hands-on feel for how things are operating and ensure that
things continue to work smoothly and with as few problems as possible.
Onsite maintenance should be performed at least yearly, and more often in some
environments depending on the quality of the electrical supply, temperature and humidity,
and how dusty the environment is. Warehouse and shop environments usually require more
frequent maintenance to ensure that fans continue to rotate and equipment stays
sufficiently cool. Equipment that does not stay sufficiently cool runs slowly and
eventually fails. Hot equipment also uses more electricity.
Maintenance is usually performed on nights and weekends to avoid service interruptions. We
schedule enough engineers to complete the work in the allotted time, and we work on
multiple tasks simultaneously to keep this service as cost-effective as possible with
minimal downtime.
The following is a list of what we do during a typical onsite maintenance visit. The list
will vary depending on the size of the network, types of equipment, and configurations of
servers. Note that we apply configuration adjustments and updates based on developed best
practices, established reliability, and what we know historically and through experience
with each manufacturer across our entire customer base, not the experiences of a
single site, engineer, or technician.
- Clean out dust
- Check fans for proper operation and sufficient cooling
- Visually inspect for hardware problems such as leaking capacitors and burned circuits
- Check and correct hard drive errors
- Check for common electrical supply issues
- Check for proper ventilation of environment
- Check battery backup units
- Inspect and correct network wiring and configuration
- Check and correct network routers, internet gateways, switches, and hubs
- Update firmware (built-in software)
- Ensure capable equipment is added to our remote monitoring
- Apply Windows and software updates
- Adjust Windows and software settings as needed
- Update antivirus and antispyware software
- Install alternative (safer) web browsers
- Defragment hard drives
- Remove temporary files and file clutter
- Remove known problem-causing software
- Remove viruses
- Check for sufficient available memory
- Check, correct, and update backup scheme
- Check for proper data redundancy
- Update DNS settings (computer name to address translation)
- Update DHCP database (network addresses)
- Look for risky user practices that should be avoided
Below is a before and after picture of a CPU heat sink and cooling fan. Before the system was cleaned out, the customer
was reporting problems with the system running slowly and randomly locking up and rebooting. Because of the dust buildup,
the processor was overheating. Cooling is extremely important, especially in the summer months.