The biochemistry departmental web servers.
-
There are two web servers: one at biochem.uthscsa.edu,
and one at biochem2.uthscsa.edu
-
Pseudonyms for biochem.uthscsa.edu are biochem.uthscsa.edu
and bioc02.uthscsa.edu
-
Biochem2 has a lower degree of security, allowing ftp and
telnet. It is set up largely for learners. Web pages
of any importance should be set up on biochem, which must be accessed by
ssh and scp.
-
From within the uthscsa.edu domain, just biochem
or biochem2 should do.
-
They are maintained by Borries Demeler: (Demeler@uthscsa.edu).
Members of the Department of Biochemistry can obtain a username and
password for hosting their own web pages by filling out the online account
application. These are services of the UTHSCSA
bioinformatics core facility.
-
The web server software is Apache.
This is the most common web server. It comes from a project that provides
open (meaning free) software. The project's web page is http://www.apache.org/
-
The operating system is Linux, which is a lot like Unix.
All commands, filenames, directory names, and your password are case
sensitive.
-
For help installing and using secure shell (ssh and scp),
see (http://www.biochem.uthscsa.edu/computer/ssh.html).
Directory structure:
-
Your username is your home directory name.
-
Mine is hs_lab. I'll use my name in many of
the examples; you would substitute your username for hs_lab
-
Whenever you log in by ssh or telnet, expect to start in
your home directory.
-
The command cd ~ always returns you to your home directory.
The command cd / does not. cd / sends you to the machine's
root directory.
-
Note that path designations in unix and linux use a forward slash
(/), not a backslash (\) the way MS-DOS does.
-
Within your home directory is a subdirectory named public_html
-
If public_html has not already been created for you,
create it with the command mkdir public_html
-
All files you want to be accessible through the web server have to be
placed in public_html, or a subdirectory of public_html.
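The layout described above can be tried out with a few shell commands. This is only a sketch: it works in a scratch directory (demo_home, a made-up name) that stands in for a real home directory, so it can be run anywhere without touching your account. On the server you would work in ~ instead.

```shell
# Scratch directory standing in for a home directory (hypothetical;
# on the server you would simply start in ~).
demo_home=$(mktemp -d)
cd "$demo_home"

# Create public_html only if it is not already there.
mkdir -p public_html

# A subdirectory of public_html is also visible to the web server.
mkdir -p public_html/files

# cd / goes to the machine's root directory, not to your home directory.
cd /
root_dir=$(pwd)

# Go back to the stand-in home directory (on the server: cd ~).
cd "$demo_home"
ls -d public_html public_html/files
```

The -p option makes mkdir harmless to repeat: it does nothing if the directory already exists.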
Your home page
-
In order for the web server to see any files in your directory,
public_html must contain a file named index.html.
-
This is your home page. It typically contains some
information identifying you and a list of links to your other pages.
-
My home page has an internet address called a uniform resource
locator (URL). It is:
http://biochem.uthscsa.edu/~hs_lab
-
On some machines, the server is set so that entering
http://biochem.uthscsa.edu/~hs_lab/ into a browser
would return a listing of the public_html directory,
and a user could then click on any file to access it. Dr. Demeler
does not have this capability activated. It is generally not recommended
to give this much access.
-
However, if there is a file named file.html in that directory,
http://biochem.uthscsa.edu/~hs_lab/file.html will
retrieve it, even if it isn't linked from index.html or anywhere
else. That is a commonly used form of weak access control, particularly
for files that are under construction.
Secure shell (ssh) access
-
Ssh is a means to log into another computer at the command
line level and do file management, or run programs. It is a more
secure form of telnet.
-
When you first get a username and password, you should make
an ssh connection to biochem to verify that they work. Then
you should use the passwd command to change your password.
Please note the password policy posted at: (http://www.biochem.uthscsa.edu/computer/password.policy.html).
Weak passwords have allowed hackers to break in before and trash the system,
so please be careful to choose a good password.
-
Later you may use ssh to list the files that have accumulated
in the biochem directory, delete obsolete ones, make subdirectories,
and possibly use a linux text editor to directly edit html files at the
source level.
Creating a home page
-
Using an html editor:
With Netscape Composer,
you just run Netscape, click on <file> then <new> then <blank
page>. Type whatever you'd like on your home page and format it however
you'd like. This editor's commands are accessed from the top toolbar
and the menu bar. Just try some; they are real easy. For example:
John Doe's home page
My pages: (coming soon)
JDoe@uthscsa.edu
Then click on <file> <save as>. This will
save a copy on the hard disk of the computer you are working on.
Save the file by the name index.html. Netscape will ask you
to supply a title and suggest one. Do not use Netscape's suggestion.
Instead enter a phrase that would be informative about what this page is
if it appeared in a bookmark list. John Doe's home page would
be good.
You could now send the saved file to your public_html
subdirectory using secure copy (scp). Note that the intrinsic Netscape
"publish" function is an ftp transfer and will not connect to biochem.
-
With Microsoft Word (starting
with Word 6 I think):
To start a new html file from Word, click <file>
and <new> and select <web pages> from the list of templates.
Format a page the way you would like it to appear. Word guesses at
a title for you. One feature of Word is that <view> <html source>
will display and allow you to directly edit features at the source level
(see below). For example, you could change the title that way.
Then save as html. Use scp to send the file to the web directory.
-
Using a text editor:
Html formatting controls consist of all plain text
characters, so one could directly have entered the file into the public_html
directory with a unix text editor. Alternatively, it could be typed
with any editor, saved as plain text, and then sent by scp. John
Doe's home page would have been entered something like this:
<html>
<title>John Doe's home page</title>
<body>
<H1><center>John Doe's <i>home page</i></center></H1>
<p>
<H2>My pages: <i>(coming soon)</i></H2>
<P>
<P>
<h3>JDoe@UTHSCSA.edu</h3>
</body>
</html>
-
The tags are always surrounded by angle brackets.
-
They are usually in pairs: for example <H2> means start
formatting as 2nd level header; and </H2> means stop formatting as 2nd
level header. The pair of tags is called a "container".
-
The visible part of the page is always surrounded by <body></body>
-
The whole document is always surrounded by <html></html>
-
The title appears before the body surrounded by <title></title>
-
Other tags used above are <i></i> meaning italicize,
and <p> meaning start a new paragraph. Although the </p> tag
is usually left out, it is not wrong to use it.
-
Html tags are case insensitive.
-
Browsers pay no attention to where you put carriage returns,
blank lines, or tabs. People who directly edit the text files (called
the source files) tend to put each new element on a new line and use blank
lines and indention to make the source easy to read.
-
A tutorial with lots of information about html formatting
tags can be found at the National Center for Supercomputing Applications
[http://www.ncsa.uiuc.edu/General/Internet/WWW/HTMLPrimerAll.html].
-
It's quite possible to get by without ever learning any html.
But knowing a little html is often quite convenient for updating a page,
or for adding some feature that the html editor is fighting you about.
-
For any feature you see on someone else's web page, you can
discover how it was formatted by using the <view> <source> feature
of your browser.
-
Both Netscape Composer and Microsoft Word allow you to switch
back and forth between WYSIWYG and source editing modes. In Netscape,
click <tools><HTML tools> to switch to source editing. In Microsoft
Word, click <view> <source>.
Once you get your home page to the public_html directory,
you should be able to see it by entering the URL into the locator box of
your browser. You can see my home page by entering http://biochem.uthscsa.edu/~hs_lab
followed by <enter>.
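For those taking the text editor route, the whole sequence can be sketched from the command line. The block below writes John Doe's page into a scratch directory and prints the scp command that would publish it. It is an illustration under assumptions: the username hs_lab is the one used in the examples here, and the copy command is only printed, not run, because it needs a real account on biochem.

```shell
# Work in a scratch directory so none of your real files are overwritten.
work=$(mktemp -d)
cd "$work"

# Write the page with a "here document": every line up to EOF goes into the file.
cat > index.html <<'EOF'
<html>
<title>John Doe's home page</title>
<body>
<h1><center>John Doe's <i>home page</i></center></h1>
<h2>My pages: <i>(coming soon)</i></h2>
<h3>JDoe@UTHSCSA.edu</h3>
</body>
</html>
EOF

# Assumed account details: substitute your own username for hs_lab.
user=hs_lab
host=biochem.uthscsa.edu

# Build the secure copy command; the path after the colon is relative
# to your home directory on the server.
upload_cmd="scp index.html $user@$host:public_html/"
echo "To publish, run: $upload_cmd"
echo "Then browse to:  http://$host/~$user"
```

Typing the printed scp command yourself will prompt for your password and place index.html in public_html.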
With other html editors.
-
There are many html editors (e.g., FrontPage, GoLive, etc.)
that give more extensive capability to insert fancy html functions.
This comes at a cost. The html files produced are often very complicated
and may not transfer easily for editing in another system. Also these
systems try to simplify your interaction with the multiple files produced
by maintaining a local copy of your entire web page system. This
can lead to complications if you intend to edit your web pages from more
than one computer. I personally recommend learning in as simple an environment
as possible. Then you can evaluate if you want the features of these
more elaborate systems.
Editing a web page
There are lots of ways to edit one of your web pages after
you have already posted it. This assumes that you keep the master
copy on the server itself. If you edit the page from more than one
source, you should remember to do a reload from your browser whenever you
retrieve it for further edits. Otherwise, you risk retrieving a copy
from your local cache that is missing some previous edits.
Say you were working on a Windows - PC platform and the
page was hosted on a unix/Linux machine.
-
To edit an existing page, you could
access the page with Netscape, <reload>, click <file> <edit page>
(that automatically runs composer on a version downloaded into your computer)
and make your edits. If you wish to view or change the source directly,
click <tools> <HTML tools>. The edited page now exists in the memory
of your PC, not on the machine hosting the web page. Use <file>
<save as> to save the file as .html and use scp to copy the edited copy
back to your web directory.
-
To edit an existing page, you could access
the file with Microsoft Internet Explorer, <refresh> click <file>
and <edit with ...>. Explorer will suggest the name of an editor
that is installed on your system. You can alter which editor it suggests
in the Internet Explorer options. The following instructions are
for Microsoft Word. Word will allow you to edit the file in WYSIWYG
mode. Or, by clicking <view> <html source>, you can switch
to editing at the source level. Save as an .html file and scp back
to the web directory.
-
You could access the file by any browser and directly save
to your hard disk, and then edit at the source level with any text editor.
-
There are a variety of specialized programs for editing web
pages. These range from freeware to very pricey commercial products.
-
You could access the web directory directly by ssh, and use
a text editor to directly edit the file at the source level.
Adding a list of hyperlinks.
Even if you are not going to add the hyperlink at source
level, it helps to know this bit of html to understand what the various
programs are doing. Here is the source to insert a hyperlink to weather.com
for San Antonio weather:
<a href="http://www.weather.com/weather/local/USTX1200">San
Antonio weather</a>
The <a href="..."> </a> tag pair is called an anchor.
This displays as San Antonio weather wherever the anchor tag appears in the prose.
The URL of the page is hidden in the tag and does not display on the page.
Instead the text between the <a> and </a> tags (in this case San
Antonio weather) is displayed and highlighted to indicate that it is
an active link. When one clicks on San Antonio weather,
the page indicated in the tag is requested by your browser.
If the page is likely to be printed and referred to on
paper, then the actual address of the link is lost. In that case
a better format at the source level would be:
San Antonio weather at
<a href="http://www.weather.com/weather/local/USTX1200">
http://www.weather.com/weather/local/USTX1200</a>
This displays as:
San Antonio weather at http://www.weather.com/weather/local/USTX1200
Now the URL is repeated in the visible part of the text,
so that you could type it into the locator box of a web browser if you
were working from a printed version of the document. I particularly
recommend the latter form for links embedded in e-mail.
For teaching materials, the first form is preferred.
Due to the disabilities act, we are asked to make materials that interact
in as friendly a way as possible with a voice browser (a web browser
running a voice synthesizer to read the material for the visually impaired).
In that mode, one would often "scan" a document by using the tab key to
jump from link to link listening to the "visible" part of the link pronounced
by the voice browser. Hence the descriptive phrase rather than the
URL would be a more informative thing to have pronounced. A third
form that retains both functionalities is:
San Antonio weather at ( http://www.weather.com/weather/local/USTX1200 ).
The parentheses around the URL are to avoid confusing the period at the
end of the line with part of the URL. The period forces the voice browser
to pause. Therefore, periods should be used at the end of titles,
headers, list items, etc., even if they are not really sentences.
You can copy links that you locate with your browser into
an html document that you are writing in a very facile way:
-
If you are using Netscape, put the mouse over the link you
want to copy (the link to the page, not the page itself). Click the
right mouse button and select <Copy link location>. Then open
the composer window with the page you are editing and <paste> where
you want the link. It will automatically be formatted with anchor
tags in the second form shown above. If you want the first form,
just overtype the visible text with San Antonio weather. (Be
careful not to highlight the entire hyperlink and start typing, or else you'll
delete the tags. Instead, highlight some of it, type San Antonio weather,
then delete the rest of the original text.) I created the 3rd form
above by pasting the URL in twice, then renaming the first instance to
San Antonio weather, and unlinking the second instance by placing the cursor
within the visible part of the hyperlink and clicking <link> <remove
link>. For unlinking or revising the URL of a link, place the cursor
within the link without highlighting any letters and click <link>.
-
Working with Microsoft Internet Explorer, the comparable
function for capturing the link is to right click on the link and select
<copy shortcut>. When you paste an internet address into a word
document, it will automatically have the anchor tags added if you have
the appropriate autoformat setting in the <tool> menu. Whereas
Netscape composer indicates the formatted link very much like the browser,
Word displays it in a nomenclature like this:
{HYPERLINK: }.
Use the preview function to see what the page will actually look like.
Manual entry of a link:
-
In Netscape composer from Communicator 4.7, you first type
the visible part of the hyperlink, then highlight it and click <Link>
from the toolbar. This displays a box in which you will enter the
http: address. There is a bug with version 4.7 such that once a hyperlink
is set, if you put the cursor at the end and try to continue typing normal
text, your new text will be placed inside the </a> tag rather than after
it. If your link gets extended into text that should not be part
of the link, just highlight the offending portion, and select <link>
then <remove link>. In Netscape 6, they changed the method of
manual entry. Now you position the cursor where you want the link,
and then select <Link>. The popup box now has separate places
to enter the URL and the visible part of the link. Netscape 6 was so slow
to load on my equipment that I took it back off.
-
In Microsoft Word, you just type the URL directly into the
text. If autoformatting is set up correctly, the URL will automatically
be formatted with tags.
-
Of course, if you are editing at source level, you just type
the tags and the visible part of the link directly.
Relative links: If the link is to another document
in the same web directory, then the full address isn't needed. Just <a
href="filename.html"> will do. Similarly <a href="files/filename.html">
will direct the web server to look in a subdirectory named files relative
to the directory where the document with the link is found. This
form of reference has the disadvantage that when you copy the file to another
directory or another computer, the links no longer work.
I have noticed that Netscape Composer sometimes alters
the relative links from what you typed in. After publishing the document,
check the links. If they do not work, reload the document with <edit
page> and check the links by putting the cursor within the link without
highlighting any part of it and clicking <link>.
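If you have ssh access, relative links can also be checked from the command line. The sketch below is one rough way to do it, not a feature of the server: it builds a small made-up site in a scratch directory, pulls every double-quoted href value out of a page, and reports whether each relative target exists as a file. The function name check_links and the sample file names are invented for the illustration.

```shell
# Build a tiny sample site in a scratch directory.
site=$(mktemp -d)
mkdir -p "$site/files"
: > "$site/filename.html"
: > "$site/files/filename.html"
cat > "$site/index.html" <<'EOF'
<a href="filename.html">Same directory</a>
<a href = "files/filename.html">In the files subdirectory</a>
<a href="missing.html">Target that does not exist</a>
<a href="http://www.apache.org/">Absolute link, not checked</a>
EOF

# Report each relative link in a page as ok or BROKEN.
# Only double-quoted href values are recognized.
check_links() {
  page=$1
  dir=$(dirname "$page")
  grep -o -i -E 'href *= *"[^"]*"' "$page" |
  sed 's/^[^"]*"//; s/"$//' |
  while IFS= read -r target; do
    case "$target" in
      *://*|mailto:*) continue ;;   # skip absolute addresses
    esac
    if [ -e "$dir/$target" ]; then
      echo "ok      $target"
    else
      echo "BROKEN  $target"
    fi
  done
}

report=$(check_links "$site/index.html")
echo "$report"
```

Targets are resolved relative to the directory holding the page, which is the same rule the browser applies to relative links.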
Putting links in a list: One of the most
common formats used in web documents is an unnumbered list. This
gives an outline-like look to the document and allows material of a finer
level of detail to be segregated from the progression of major points
by use of different levels of indention. You see this format above
wherever there are bullets. The bullets can be removed, but it's
better to leave them there because it helps avoid getting elements inadvertently
included in a list, which can lead to some perplexing formatting problems.
Both Netscape and Microsoft Word have an icon in the tool bar consisting
of 3 lines with bullets for imposing the list format. They also have
icons for changing indention.
At the source level, the formatting tags for a list are <ul></ul> around
the whole list and <li></li> around each item.
Different levels of indention are used above
to distinguish the relationship of certain subpages within NCBI's large
bioinformatics site that have been separately bookmarked to allow faster
access to these particular functions. It's a good idea to mark the
top page (their home page) in addition to the specific pages that interest
you. That's because the subpages often get moved making your links
go dead. Presumably by starting at their home page, you'll easily
find out where they moved them. I also recommend including the name
of the organization in or near the link. Then if they change their
URL, you have something to paste into a search engine to find them again.
The source level organization for the segment above would
be
<ul>
<li> Major Bioinformatics Sites</li>
<ul>
<li><a
href = "...">National Center for Biotechnology Information</a></li>
<ul>
<li><a href = "...">Entrez</a></li>
<ul>
<li><a href = "...">Nucleotides</a></li>
<li><a href = "...">Proteins</a></li>
</ul>
<li><a href = "...">Blast</a></li>
<ul>
<li><a href = "...">Blast 2 sequences</a></li>
</ul>
<li><a href = "...">Medline</a></li>
</ul>
<li><a
href = "...">Human genome browser</a> at
<a href = "...">The human genome project at UCSC</a></li>
<li><a
href = "...">EMBL</a></li>
<li><a
href = "...">European Bioinformatics Institute</a></li>
<ul>
<li><a href = "...">EBI alignment database</a></li>
</ul>
</ul>
... additional categories
</ul>
Notice how indenting the source file helps keep the <ul>
and </ul> tags properly paired. That only happens when the tags
were directly entered with a text editor. The automated html editors
don't generally bother to do much formatting at the source level.
One of the most common formatting problems is failing to terminate a list
and causing the bottom of the page to be inconsistently formatted.
It's not unusual for complex nested lists to get into a confused state
such that one has to go into the source level to find the missing tag.
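A quick first check for a missing list tag is to count the opening and closing tags from the command line. The sketch below does this for a small sample file that is deliberately missing one </ul>. The helper name count_tag is made up, and the count only recognizes plain <ul> tags without attributes, so treat it as a rough aid rather than a validator.

```shell
# Sample page with three <ul> tags but only two </ul> tags.
page=$(mktemp)
cat > "$page" <<'EOF'
<ul>
<li>Major Bioinformatics Sites</li>
  <ul>
  <li>Entrez</li>
    <ul>
    <li>Nucleotides</li>
    <li>Proteins</li>
  </ul>
</ul>
EOF

# Count occurrences of a literal tag, ignoring upper/lower case.
count_tag() {
  grep -o -i -F "$1" "$2" | wc -l | tr -d ' '
}

opens=$(count_tag '<ul>' "$page")
closes=$(count_tag '</ul>' "$page")

if [ "$opens" -eq "$closes" ]; then
  echo "balanced: $opens list(s)"
else
  echo "unbalanced: $opens <ul> but $closes </ul>"
fi
```

When the counts differ, the indentation of the source is usually the fastest way to see which list was left open.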
To make the list friendlier to the voice browser, use
ordered lists instead of unordered lists. These are indicated by
the icon with numbered lines. The html tags are <ol></ol>. Ordered
lists are automatically numbered. The voice browser reads the numbers,
but makes no note of bullets or of indention. You can change the numbering
to lettering at the source level by adding a type attribute to the <ol>
tag. It would look like this: <ol type=a>. That would cause
letters to be read by the voice browser distinguishing a sublist from its
parent list.
Password protection
NOTE: When I first tried this, it didn't work. That
was because Dr. Demeler had to give me permission at the system administrator
level to be able to use passwords in this way. He indicated that
he would set that as the default for everyone. However, if you discover
that you can access your password-protected page without giving the password,
you should call it to Dr. Demeler's attention.
-
You can apply password protection to the files of a directory,
such that when someone tries to access any file in that directory through
their web browser, they will be asked to enter a username and a password.
In the example below, you'd transmit the same username and password to
a list of intended recipients. If you had different files for different
sets of people, you'd put them in different subdirectories and give each
subdirectory its own username and password combination.
-
You create valid username:password combinations and store
them in your home directory with a program named htpasswd
-
First set the default directory to your home directory.
-
The first time you run htpasswd the syntax is:
htpasswd -c .htpasswd <username>
The htpasswd program is in the /usr/local/sbin directory.
Unless that directory is in your default path, you will have to type the following
to execute the program:
/usr/local/sbin/htpasswd -c .htpasswd <username>
-
Subsequently the program htpasswd moved to /usr/local/apache/bin/
-
A common problem with unix and linux systems is that they
take a lot of updating and maintaining with the consequence that files
often get moved around. Often the system administrators overlook
files like this one which disrupt user operation when they get moved.
One way to look for a file is to change to the root directory with cd /,
then issue the command ls -R > ~/all.dir and ignore all the "permission
denied" messages. This will place a file in your home directory named
all.dir containing listings of all directories on the machine.
Then cd ~, vi all.dir, and use the / command of vi to
search for the filename.
-
The program will prompt for the password to associate with
this username.
-
By naming the password file .htpasswd with a leading
period, it becomes a hidden file. You must use ls -a to see
hidden files.
-
If you use the -c option and the file .htpasswd
already exists, it will be replaced.
-
The passwords in the file are stored in encrypted form, so that if someone
gets into your directory, they can't just read all the passwords.
-
Subsequently, you can add more username-password combinations
to this file by:
htpasswd .htpasswd <new username>
-
To apply a username-password combination to a subdirectory,
place a text file with the following contents in the subdirectory under
the name .htaccess
AuthUserFile /home/user/<your username>/.htpasswd
AuthName Private
AuthType Basic
<limit GET POST>
require user <the username for the protection of this directory>
</limit>
-
The first line gives the path and filename of the password
file, where the password corresponding to the username will be looked up.
On a different machine, it might be a different path.
-
The require line lists usernames that will be granted
access to this directory (there could be more than one).
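The steps above can be pulled together in a short shell sketch. It is an illustration under assumptions, not the exact setup on biochem: it works in a scratch directory standing in for your home directory, the subdirectory name private and the username labguest are made up, and the path written on the AuthUserFile line is the scratch path rather than the real /home/user/... path. The password file itself is not created here because htpasswd prompts for the password interactively.

```shell
# Scratch directory standing in for your home directory.
demo_home=$(mktemp -d)
protected="$demo_home/public_html/private"   # hypothetical subdirectory
mkdir -p "$protected"

# Look for htpasswd on the path first, then in the two locations
# mentioned in the text.
htpasswd_bin=$(command -v htpasswd ||
  find /usr/local/sbin /usr/local/apache/bin -name htpasswd 2>/dev/null | head -n 1)
echo "htpasswd: ${htpasswd_bin:-not found on this machine}"

# Hypothetical username that visitors will type into the browser prompt.
web_user=labguest

# Write the .htaccess file into the directory to be protected.
cat > "$protected/.htaccess" <<EOF
AuthUserFile $demo_home/.htpasswd
AuthName Private
AuthType Basic
<limit GET POST>
require user $web_user
</limit>
EOF

# The leading period hides the file from plain ls; ls -a shows it.
ls -a "$protected"
cat "$protected/.htaccess"

# On the server, create the password file from your home directory with:
#   htpasswd -c .htpasswd labguest
```

After setting this up for real, try the protected page from a browser to confirm that it actually asks for the password.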
Last updated 03/31/2003 - Steve Hardies