Discussion:
[nycphp-talk] Many pages: one script
Elliotte Harold
2007-08-05 17:42:41 UTC
I'm considering a simple site that I may design in PHP. PHP is probably
the simplest solution except for one thing: it carries a very strong
coupling between pages and scripts. As far as I've ever been able to
tell PHP really, really, really wants there to be a single primary .php
file for each URL that does not contain a query string (though that file
may of course invoke others).

For the system I'm designing that simply won't work. In Java servlet
environments it's relatively trivial to map one servlet to an entire
directory structure, so that it handles all requests for all pages
within that hierarchy.

Is there any *reasonable* way to do this in PHP? The only way I've ever
seen is what WordPress does: use mod_rewrite to redirect all requests
within the hierarchy to a custom dispatcher script that converts actual
hierarchy components into query string variables. I am impressed by this
hack, but it's way too kludgy for me to be comfortable with. For one
thing, I don't want to depend on mod_rewrite if I don't have to.

Surely by now there's a better way? How do I overcome the one file per
URL assumption that PHP makes?
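
Concretely, the WordPress-style hack described above boils down to something like the following minimal sketch; the rewrite rule, the "path" parameter, and the echo stubs standing in for real handlers are all invented for illustration:

<?php
// dispatcher.php -- the single script that services every URL in the hierarchy.
//
// It is reached through a rewrite rule along these lines (in .htaccess or the
// vhost), which is the mod_rewrite dependency mentioned above:
//
//   RewriteEngine On
//   RewriteCond %{REQUEST_FILENAME} !-f
//   RewriteRule ^(.*)$ /dispatcher.php?path=$1 [QSA,L]

// The real path components arrive as an ordinary query string variable.
$path  = isset($_GET['path']) ? $_GET['path'] : '';
$parts = array_values(array_filter(explode('/', $path)));

// Invented routing: the first component picks a "handler", the rest are arguments.
switch (array_shift($parts)) {
    case 'users':
        echo 'Profile page for ', htmlspecialchars(isset($parts[0]) ? $parts[0] : '');
        break;
    case 'archive':
        echo 'Archive for ', htmlspecialchars(implode('-', $parts));
        break;
    default:
        echo 'Front page';
}
?>
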
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/
Edward Potter
2007-08-05 18:04:29 UTC
hmmmm, I have never found this to be a problem. Using includes, you
can pull in .php code from anywhere, even pages with a .php extension
may be 99.99% HTML, with just a single include('foo.php') in it.
Keeps things super streamlined, and your pages are very readable.
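
A tiny, made-up illustration of what that looks like (the file names are invented): an almost-all-HTML page pulling its one piece of PHP from elsewhere.

<?php /* about.php -- a page that is 99.99% HTML */ ?>
<html>
 <head><title>About</title></head>
 <body>
  <?php include('foo.php'); /* the one shared piece of PHP, pulled in from one place */ ?>
  <p>Plain static content for this particular page.</p>
 </body>
</html>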

:-) ed
Post by Elliotte Harold
I'm considering a simple site that I may design in PHP. PHP is probably
the simplest solution except for one thing: it carries a very strong
coupling between pages and scripts. As far as I've ever been able to
tell PHP really, really, really wants there to be a single primary .php
file for each URL that does not contain a query string (though that file
may of course invoke others).
For the system I'm designing that simply won't work. In Java servlet
environments it's relatively trivial to map one servlet to an entire
directory structure, so that it handles all requests for all pages
within that hierarchy.
Is there any *reasonable* way to do this in PHP? The only way I've ever
seen is what WordPress does: use mod_rewrite to redirect all requests
within the hierarchy to a custom dispatcher script that converts actual
hierarchy components into query string variables. I am impressed by this
hack, but it's way too kludgy for me to be comfortable with. For one
thing, I don't want to depend on mod_rewrite if I don't have to.
Surely by now there's a better way? How do I overcome the one file per
URL assumption that PHP makes?
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/
--
the Blog: http://www.utopiaparkway.com
the Karma: http://www.coderswithconscience.com
the Projects: http://flickr.com/photos/86842405 at N00/
the Store: http://astore.amazon.com/httpwwwutopic-20
Elliotte Harold
2007-08-05 18:16:18 UTC
Post by Edward Potter
hmmmm, I have never found this to be a problem. Using includes, you
can pull in .php code from anywhere, even pages with a .php extension
may be 99.99% html, with a just a single include('foo.php') in it.
Keeps things super streamlined, and your pages are very readable.
You've got it backwards. I want one script to service many URLs, not
many scripts to service one URL.
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/
David Krings
2007-08-05 23:45:05 UTC
Post by Elliotte Harold
Post by Edward Potter
hmmmm, I have never found this to be a problem. Using includes, you
can pull in .php code from anywhere, even pages with a .php extension
may be 99.99% html, with a just a single include('foo.php') in it.
Keeps things super streamlined, and your pages are very readable.
You've got it backwards. I want one script to service many URLs, not
many scripts to service one URL.
And that is exactly how Edward described it: you include the one script in each of the many pages (one per URL) that should make use of it.
I guess that isn't what you are after; it just reads that way.

The other posts suggest using a gateway script, something that I
thought about using at some point as well. My home server can be
accessed through different URLs, and although it is the same server I
want a PHP script to pick up the request and redirect it to the
designated location. This way my wife can do whatever she wants on her
portion and I can do whatever I want on my portion. Since we don't have
time to even start on a decent web site, this is a moot point.
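
A rough sketch of that kind of gateway, with the hostnames and target paths invented:

<?php
// gateway.php -- one entry point; look at which hostname was requested and hand off.
$host = isset($_SERVER['HTTP_HOST']) ? strtolower($_SERVER['HTTP_HOST']) : '';

if (strpos($host, 'her-site.example.com') !== false) {
    header('Location: /hers/index.php');
} else {
    header('Location: /mine/index.php');
}
exit;
?>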

Coming back to what you want to accomplish, maybe defining exactly what
you need and giving some examples would help reduce the confusion.

David
Elliotte Harold
2007-08-06 22:40:04 UTC
Post by David Krings
Post by Elliotte Harold
Post by Edward Potter
hmmmm, I have never found this to be a problem. Using includes, you
can pull in .php code from anywhere, even pages with a .php extension
may be 99.99% html, with a just a single include('foo.php') in it.
Keeps things super streamlined, and your pages are very readable.
You've got it backwards. I want one script to service many URLs, not
many scripts to service one URL.
And that is exactly how Edward described it. You include the one script
into the many URLs you want to make use of the script.
I guess that isn't what you are after, it just reads that way.
What you are proposing is not one script to service N URLs. It is N+1
scripts to service N URLs. It is still necessary to create N separate
loader scripts and place them at the right locations. That doesn't
scale, and it's an extraordinary waste of resources when only a small
percentage of the possible URLs will ever be reached.

The goal here is to avoid having to manually create and maintain
separate files for each URL. One file: many URLs. That's the goal.

Imagine, for example, a site with a separate URL structure for each
user, or a separate URL for each date in history.

The point about this being an Apache problem rather than a PHP problem
is understood, except that Java/Tomcat/mod_jk does seem able to
accomplish what I'm looking for, so the real lack may not be in Apache
but in how PHP connects to Apache.

Overall, though, I suspect all parties (Apache, PHP, Tomcat, Rails,
etc.) are still too mired in circa-1994 models of web servers serving
file systems. For example, once you map /foo to a servlet you can't then
map /foo/bar to something else.

I'm curious whether there are any web servers out there that do not start
with the assumption that each URL maps to a file somewhere. What if a web
server were designed to allow all URLs to be delegated to specific
handlers? A file system handler need not be a special case: it would be
no different from a database handler or a PHP handler.

RESTful API *design* is fairly easy to do. RESTful API *implementation*
is fairly hard because no servers I've seen provide sufficient
flexibility. Coming up on the Web's 20th anniversary, we still haven't
learned how to take HTTP on its own terms rather than by pretending it's
something else we're more familiar with. Revolutions take time. :-)
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/
David Krings
2007-08-06 23:23:29 UTC
Post by Elliotte Harold
Post by David Krings
Post by Elliotte Harold
Post by Edward Potter
hmmmm, I have never found this to be a problem. Using includes, you
can pull in .php code from anywhere, even pages with a .php extension
may be 99.99% html, with a just a single include('foo.php') in it.
Keeps things super streamlined, and your pages are very readable.
You've got it backwards. I want one script to service many URLs, not
many scripts to service one URL.
And that is exactly how Edward described it. You include the one
script into the many URLs you want to make use of the script.
I guess that isn't what you are after, it just reads that way.
What you are proposing is not one script to service N URLs. It is N+1
scripts to service N URLs. It is still necessary to create N separate
loader scripts and place them at the right locations. That doesn't
scale, and it's an extraordinary waste of resources when a small
percentage of the possible URLs will ever be reached.
I sincerely apologize for helping out. I am convinced that my proposal
would be a suitable solution (you need one script, not more), but
apparently confusion and misunderstanding take precedence. If I could
I'd eradicate my posts to save you from the disgust that I must have
generated. In any case, I am deeply sorry to have caused you discomfort.
Post by Elliotte Harold
Overall, though, I suspect all parties (Apache, PHP, Tomcat, Rails. etc.) are still too mired in circa-1994 models of web servers serving file systems. For example, once you map /foo to a servlet you can't then map /foo/bar to something else.
Well, then use IIS and ASP if that is a better solution. As you pointed
out, these are "circa-1994" models that apparently work very well in
2007. So much so that almost 2/3 of all internet sites use exactly this
technology. Those folks must all be idiots for using something that so
obviously doesn't work! There isn't much technology around that was
created in the 90s and still works well for today's needs. I don't
see the big deal if a web server mirrors a file system structure; the
problem is how to get to the desired file - which can be done with an
arbiter script that gets hit on each access and then redirects to the
desired location.
Since the server just has to know what to serve up, it will be quite easy
to structure the site based on unique IDs and use a database table to
retrieve the corresponding paths.
Calling mysite.gov/1234 directs to mysite.gov/foo/bar/index.php and
calling mysite.gov/4321 directs to mysite.gov/bar/foo/index.php. Of
course, you need to add that information to a table or a simple INI file
if no db is available. With a table you get one db connection, one
select query, a few lines of error output in case the resource doesn't
exist, and one line for the redirect. A similar approach works when using
some authentication. And for the case where each resource is supposed to
use the same code, instrument each resource with an include. I'd use two
independent script files for that, but I guess one could smush that into
one file as well. And by include I mean using the PHP construct
"include()", not copying the code into each resource. And even that
could be automated. I have written self-replicating scripts that worked
out OK until I wanted to make changes.
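
A quick sketch of such an arbiter, assuming a lookup table named resource_map(id, path) and connection details that are, of course, invented:

<?php
// arbiter.php -- hit on each access, e.g. as /arbiter.php?id=1234 (or via a rewrite rule).
$id = isset($_GET['id']) ? (int) $_GET['id'] : 0;

// One db connection, one select query...
$db   = new PDO('mysql:host=localhost;dbname=site', 'user', 'secret');
$stmt = $db->prepare('SELECT path FROM resource_map WHERE id = ?');
$stmt->execute(array($id));
$path = $stmt->fetchColumn();

// ...a few lines of error output in case the resource doesn't exist...
if ($path === false) {
    header('HTTP/1.0 404 Not Found');
    echo 'Unknown resource ', $id;
    exit;
}

// ...and one line for the redirect.
header('Location: ' . $path);
exit;
?>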

I apologize upfront if my comments yet again generated an intellectual
distaste. In that case, I give up and you are on your own.

Have a nice day,

David
Hans Zaunere
2007-08-07 12:42:12 UTC
Post by Elliotte Harold
Post by David Krings
Post by Elliotte Harold
Post by Edward Potter
hmmmm, I have never found this to be a problem. Using
includes, you can pull in .php code from anywhere, even pages
with a .php extension may be 99.99% html, with a just a single
include('foo.php') in it. Keeps things super streamlined, and
your pages are very readable.
You've got it backwards. I want one script to service many URLs,
not many scripts to service one URL.
And that is exactly how Edward described it. You include the one
script into the many URLs you want to make use of the script.
I guess that isn't what you are after, it just reads that way.
What you are proposing is not one script to service N URLs. It is N+1
scripts to service N URLs. It is still necessary to create N separate
loader scripts and place them at the right locations. That doesn't
scale, and it's an extraordinary waste of resources when a small
percentage of the possible URLs will ever be reached.
The goal here is to avoid having to manually create and maintain
separate files for each URL. One file: many URLs. That's the goal.
Imagine, for example, a site with a separate URL structure for each
user, or a separate URL for each date in history.
The point about this being an Apache problem rather than a PHP problem
is understood, except that Java/Tomcat/mod_jk does seem able to
Sure they do - because Java essentially requires its own server to run.
Post by Elliotte Harold
accomplish what I'm looking for, so the real lack may not be in Apache
but in now PHP connects to Apache.
Quite true - mod_perl can do more, but as far as I know, still not as much
as Tomcat since they're linked at the hip.
Post by Elliotte Harold
Overall, though, I suspect all parties (Apache, PHP, Tomcat, Rails.
etc.) are still too mired in circa-1994 models of web servers serving
file systems. For example, once you map /foo to a servlet you can't
then map /foo/bar to something else.
Hmm, interesting...
Post by Elliotte Harold
I'm curious if they're any web servers out there that do not start
with the assumption that each URL maps to a file somewhere. What if a
While it seems like a good idea, it'd probably cause more grief... think of
all those poor images, CSS, PDFs, JS, static HTML, etc. files out there that
we assume get served directly, correctly, statically - and quickly - right
from the filesystem.
Post by Elliotte Harold
web server were designed to allow all URLs to be delegated to specific
handlers? A file system handler need be only special case, no
Well, that essentially exists with AddHandler and friends in Apache, and
it is the crux of many of the techniques described in this thread.
Post by Elliotte Harold
different from a database handler or a PHP handler.
RESTful API *design* is fairly easy to do. RESTful API
*implementation* is fairly hard because no servers I've seen provide
sufficient flexibility. Coming up on the Web's 20th anniversary, we
still haven't learned how to take HTTP on its own terms rather than
by pretending its something else we're more familiar with.
Revolutions take time. :-)
Agreed - I'm still waiting for XSLT to take us by storm. And I keep that
Javascript turned off in my browser, since no web site should depend on it
being available... right?

Web's 20 ?= Web 2.0 It's just a decimal point away...

H
David Krings
2007-08-07 13:05:22 UTC
Post by Hans Zaunere
Agreed - I'm still waiting for XSLT to take us by storm. And I keep that
Javascript turned off in my browser, since no web site should depend on it
being available... right?
Both true. XSLT is indeed an awesome technology. The reason why it
doesn't catch on is that XML and XSLT are designed for machines to read
and not for humans. Just see how difficult it is for many to create
proper HTML!
Post by Hans Zaunere
Web's 20 ?= Web 2.0 It's just a decimal point away...
With broadband being very expensive and just about anything Web 2.0 being
created only for profit, it will be a long time before we get there. And
also, what is suddenly wrong with fat clients and desktop apps?

David
Elliotte Rusty Harold
2007-08-07 21:15:35 UTC
Post by David Krings
Post by Hans Zaunere
Agreed - I'm still waiting for XSLT to take us by storm. And I keep that
Javascript turned off in my browser, since no web site should depend on it
being available... right?
Both true. XSLT is indeed an awesome technology. The reason why it
doesn't catch on is that XML and XSLT is designed for machines to read
and not for humans. Just see how difficult it is for many to create
proper HTML!
There may not be a lot of XSLT on the web yet, but there's more than
you'd think, especially if you get to look behind the curtains. Many
more sites are using it internally than are exposing it publicly.

And in some fields such as publishing XSLT has been an absolute godsend.
It's much less heralded than PHP or Rails, but to me it's a far more
powerful and productive language for the uses for which it's intended.
That is, XSLT improves my productivity when doing XMLish things more
than PHP improves my productivity when doing Webish things. I'm not
saying XSLT is a general purpose web development language like PHP. It's
definitely true that the use cases for XSLT are somewhat more
specialized than the use cases for PHP. I.e. more people want to do
webby things than XML things.

Of course, if you really want to rock, try combining
XQuery+XQueryP+APP+a native XML database. Once the tooling matures a
bit, that's a stack that's going to make all previous web dev frameworks
look like PowerBuilder. Hmm, need a good acronym for that one: LAXQE
perhaps? (Linux+Atom Publishing Protocol+XQuery+eXist) Have to work on
that a bit. :-)

--
Elliotte Rusty Harold
Jon Baer
2007-08-07 21:29:18 UTC
Isn't what you described already in some type of existence with the
W3C SPARQL idea ...

http://www.w3.org/2001/sw/DataAccess/

Or do you have an opinion on it?

- Jon
Post by Elliotte Rusty Harold
Post by David Krings
Post by Hans Zaunere
Agreed - I'm still waiting for XSLT to take us by storm. And I keep that
Javascript turned off in my browser, since no web site should depend on it
being available... right?
Both true. XSLT is indeed an awesome technology. The reason why it
doesn't catch on is that XML and XSLT is designed for machines to
read and not for humans. Just see how difficult it is for many to
create proper HTML!
There may not be a lot of XSLT on the web yet, but there's more
than you'd think; especially if you get to look behind the
curtains. Many more sites are using it internally than are exposing
it publicly.
And in some fields such as publishing XSLT has been an absolute
godsend. It's much less heralded than PHP or Rails, but to me it's
a far more powerful and productive language for the uses for which
it's intended. That is, XSLT improves my productivity when doing
XMLish things more than PHP improves my productivity when doing
Webish things. I'm not saying XSLT is a general purpose web
development language like PHP. It's definitely true that the use
cases for XSLT are somewhat more specialized than the use cases for
PHP. I.e. more people want to do webby things than XML things.
Of course, if you really want to rock, try combining XQuery+XQueryP
+APP+a native XML database. Once the tooling matures a bit, that's
a stack that's going to make all previous web dev frameworks look
like PowerBuilder. Hmm, need a good acronym for that one: LAXQE
perhaps? (Linux+Atom Publishing Protocol+XQuery+eXist) Have to work
on that a bit. :-)
--
Elliotte Rusty Harold
Elliotte Rusty Harold
2007-08-07 21:30:59 UTC
Isn't what you described already in some type of existence with the W3C
SPARQL idea ...
http://www.w3.org/2001/sw/DataAccess/
Or do you have an opinion on it?
That's really something very different, and something I'm very skeptical of.

--
Elliotte
Kenneth Downs
2007-08-08 11:49:38 UTC
Post by Elliotte Rusty Harold
Of course, if you really want to rock, try combining
XQuery+XQueryP+APP+a native XML database. Once the tooling matures a
bit, that's a stack that's going to make all previous web dev
frameworks look like PowerBuilder.
Not a chance. There is no such thing as a native XML database and there
never will be because XML is a file format (oops, data format), and an
extremely inefficient one at that. To have a native database you need a
data model. XML uses the hierarchical model and if you're going to
build a native hierarchical database you sure wouldn't use the wasteful
XML format internally to store the data.

And if you insist on using XML just because it's so wonderful to use 17
characters to store the state <STATE>NY</STATE>, it will never be able
to compete with even the most immature relational engines for pure
speed. Maybe on example and toy sites, but never for anything that
needs to scale.

If I'm wrong, dinner's on me.
--
Kenneth Downs
Secure Data Software, Inc.
www.secdat.com www.andromeda-project.org
631-689-7200 Fax: 631-689-0527
cell: 631-379-0010
David Krings
2007-08-08 11:59:35 UTC
Post by Kenneth Downs
And if you insist on using XML just because its so wonderful to use 17
characters to store the state <STATE>NY</STATE>, it will never be able
to compete with even the most immature relational engines for pure
speed. Maybe on example and toy sites, but never for anything that
needs to scale.
If I'm wrong, dinner's on me.
Don't worry! Finally someone who sees XML for what it is: by far not as
glorious, nor the key to world peace, as many claim. In the end XML is an
INI file on steroids. And XML without a DTD is only half of the pie...
so much for self-defining.

I worked on projects where the project manager insisted on taking all
SQL queries, converting them to XML just so that the database layer could
convert them from XML back to SQL. I asked why not talk SQL directly to
the database. The answer was that we have to have XML in it, just so
that we can be open to third parties. Huh? That was for an embedded
system and there aren't many (any?) db engines around that don't
understand at least some SQL.

I think XML sucks. So now let's see what we can do with XML.

David
Elliotte Harold
2007-08-08 14:10:20 UTC
Post by David Krings
Don't worry! Finally someone who sees XML as what it is and that it is
by far not as glorious and the key to world peace as many claim. In the
end XML is an ini file on steroids. And XML without DTD is only half of
the pie....so much for self-defining.
If all you do is INI files, then all you'll use XML for is INI files.
Some of us do a little more than that though.

Roughly 80% of the world's data cannot plausibly be stored in a
relational database. The 20% that does fit there is important enough
that we've spent the last 20 years stuffing it into relational databases
and doing interesting things with it. I'm still doing a lot of that.

But there's a lot more data out there that doesn't look like tables than
does. Much of this data fits very nicely in a native XML database like
Mark Logic or eXist. There's also data that has some tabular parts and
some non-tabular parts. This may work well in a hybrid XML-relational
database like DB2 9.

If your only place to put pegs is a table with square holes, then you're
going to try to pound every peg you find into a square hole. However, some
of us have noticed that a lot of the pegs we encounter aren't shaped
like squares, and sometimes we need to buy a different table with
different shaped holes. :-)

Relational databases didn't take the world by storm overnight. XML
databases won't either. But they will be adopted because they do let
people solve problems they have today that they cannot solve with any
other tools.
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/
David Krings
2007-08-08 14:24:49 UTC
Post by Elliotte Harold
Relational databases didn't take the world by storm overnight. XML
databases won't either. But they will be adopted because they do let
people solve problems they have today that they cannot solve with any
other tools.
My problem is that I don't get paid 500k/year for working 10 hours a
week with full benefits and 8 weeks paid vacation. Now, show me how to
fix that with XML databases and dinner is on me, twice even. ;)

The best thing about XML is that it really is just a flat file and
everything in it has a beginning and an end. I cannot think of anything
that one would want to store in XML that cannot be stored in a db and
that also cannot be stored in a text file with way less overhead.
Examples are welcome.

David
Keith Casey
2007-08-08 14:59:43 UTC
Post by David Krings
The best thing about XML is that it is really is just a flat file and
everything in it has a beginning and an end. I cannot think of anything
that one would want to store in XML that cannot be stored in a db and
that also cannot be stored in a text file with way less overhead.
Examples are welcome.
There are a number of things which are a much better fit for XML
than for a database or a flat file... such as arbitrary depth/width
tree structures.

For example, when I worked at the Library of Congress answering the
Ultimate Geek Question, we were working with such trees in the
digitization process. Handling arbitrary width in a database is basic
but handling arbitrary depth quickly gets you into a table structure
that looks like:

tableName - id, displayOrder, parent_id

While that seems simple, it gets to be quite nasty quite quickly
because of the recursive database queries to pull everything back...
and you need to convert it to the actual tree before you can do much
with it. Using a Lazy Loader improves things a bit, but you're just
spreading out the work.
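
For illustration, rebuilding that kind of tree in PHP from the adjacency-list table sketched above might look like the following; the table and column names follow the example, the connection details are invented, and the one-query-per-node recursion is exactly the round-trip cost being described:

<?php
// Rebuild a nested tree from tableName(id, displayOrder, parent_id).
function load_children(PDO $db, $parentId) {
    $stmt = $db->prepare(
        'SELECT id FROM tableName WHERE parent_id = ? ORDER BY displayOrder');
    $stmt->execute(array($parentId));

    $children = array();
    foreach ($stmt->fetchAll(PDO::FETCH_COLUMN) as $childId) {
        // Each level of depth costs another round trip to the database.
        $children[$childId] = load_children($db, $childId);
    }
    return $children;
}

$db   = new PDO('mysql:host=localhost;dbname=trees', 'user', 'secret');
$tree = load_children($db, 0);   // assuming the root rows use parent_id = 0
?>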

Alternatively, with XML, it's already in tree form. Add a simple
recursive template, store it as XML (in a db or a native XML store),
and you can render it as HTML/whatever in one fell swoop. It's
really kind of nifty when it's done right.

The same applies to work breakdown structures... *cough*dotProject*cough*.

One disclaimer... when I was at the LoC, Elliotte's XML Bible was one
of our core resources.

kc
--
D. Keith Casey Jr.
CEO, CaseySoftware, LLC
http://CaseySoftware.com
Elliotte Harold
2007-08-08 15:20:10 UTC
Post by David Krings
The best thing about XML is that it is really is just a flat file and
everything in it has a beginning and an end. I cannot think of anything
that one would want to store in XML that cannot be stored in a db and
that also cannot be stored in a text file with way less overhead.
Examples are welcome.
You need to remember we're talking about XML databases that store
collections of XML documents, not simply a single XML document. A single
relational record can be easily stored in a tab-delimited text file with
way less overhead too. But when you have millions of these things that
you have to sort, search, update, backup, specify access permissions,
and so forth, then the database begins to show its worth.

With that in mind, consider:

The Encyclopedia Britannica
The collected publications of O'Reilly Media
The complete work product of Skadden-Arps
The New York Times
The collected works of William Shakespeare and other Elizabethan dramatists

Then consider that you want to be able to make queries like, "Find all
the paragraphs containing both the words 'Bush' and 'incompetent'" so
you can't just shove everything into a BLOB.
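
As a rough sketch of what such a structure-aware query looks like, here it is against a single document using PHP's DOM and XPath; a native XML database would express essentially the same thing in XQuery and run it across the whole collection (the element name and file name are assumed):

<?php
// Find every <para> whose text contains both words, within one loaded document.
$doc = new DOMDocument();
$doc->load('encyclopedia-volume-1.xml');   // hypothetical document

$xpath = new DOMXPath($doc);
$hits  = $xpath->query("//para[contains(., 'Bush') and contains(., 'incompetent')]");

foreach ($hits as $para) {
    echo $para->textContent, "\n\n";
}
?>
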
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/
Jeremy Mcentire
2007-08-08 18:50:33 UTC
Post by Elliotte Harold
Post by David Krings
The best thing about XML is that it is really is just a flat file
and everything in it has a beginning and an end. I cannot think of
anything that one would want to store in XML that cannot be stored
in a db and that also cannot be stored in a text file with way
less overhead. Examples are welcome.
You need to remember we're talking about XML databases that store
collections of XML documents, not simply a single XML document. A
single relational record can be easily stored in a tab-delimited
text file with way less overhead too. But when you have millions of
these things that you have to sort, search, update, backup, specify
access permissions, and so forth, then the database begins to show
its worth.
The Encyclopedia Britannica
The collected publications of O'Reilly Media
The complete work product of Skadden-Arps
The New York Times
The collected works of William Shakespeare and other Elizaebthan dramatists
Then consider that you want to be able to make queries like, "Find
all the paragraphs containing both the words 'Bush' and
'incompetent'" so you can't just shove everything into a BLOB.
Not to mention that the utility of XML is not simply inherent in its
being small. No one is claiming that it is small. But there is
utility there that makes it worth the size. There is always a
tradeoff. To complain about XML because it takes more characters to
store data, or lays it out in a manner that isn't the same old RDBMS,
isn't valid. I'd entertain it if I thought you wrote code in a
binary format the machine understands natively. I don't think you
do. All that wasted white-space... for shame.

*Grin* Seriously. With any abstraction comes overhead and a loss of
flexibility or power. XML doesn't do everything perfectly. But, it
fits a lot of things well-enough. *Shudder* I sound like an
engineer now rather than a mathematician. Yet, it is a valid
argument, I suppose. I've heard it enough times now to repeat it.

Jeremy
Elliotte Harold
2007-08-09 00:31:45 UTC
Post by Jeremy Mcentire
Not to mention that the utility of XML is not simply inherent in it's
being small. No one is claiming that it is small. But, there is
utility there that makes it worth the size. There is always a
tradeoff. To complain about XML because it takes more characters to
store data or lays it out in a manner that isn't the same old RDBM isn't
valid. I'd entertain it if I thought you wrote code in a binary format
the machine understands natively. I don't think you do. All that
wasted white-space... for shame.
Of course, one advantage of an XML database is that there are no rules
about what the actual disk format looks like. What matters is what goes
in and what comes out. Databases are free to make optimizations for
both storage space and speed, and in practice most do. The disk format
is no more relevant to the XQuery data model than it is to the SQL data
model. Storing a million documents in an XML database may well take a
lot less space than storing them in individual files.
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/
Kenneth Downs
2007-08-08 20:42:04 UTC
Post by Elliotte Harold
Post by David Krings
The best thing about XML is that it is really is just a flat file and
everything in it has a beginning and an end. I cannot think of
anything that one would want to store in XML that cannot be stored in
a db and that also cannot be stored in a text file with way less
overhead. Examples are welcome.
You need to remember we're talking about XML databases that store
collections of XML documents, not simply a single XML document. A
single relational record can be easily stored in a tab-delimited text
file with way less overhead too. But when you have millions of these
things that you have to sort, search, update, backup, specify access
permissions, and so forth, then the database begins to show its worth.
The Encyclopedia Britannica
The collected publications of O'Reilly Media
The complete work product of Skadden-Arps
The New York Times
The collected works of William Shakespeare and other Elizaebthan dramatists
Then consider that you want to be able to make queries like, "Find all
the paragraphs containing both the words 'Bush' and 'incompetent'" so
you can't just shove everything into a BLOB.
Two words: text search.
--
Kenneth Downs
Secure Data Software, Inc.
www.secdat.com www.andromeda-project.org
631-689-7200 Fax: 631-689-0527
cell: 631-379-0010
Elliotte Harold
2007-08-09 00:38:41 UTC
Post by Kenneth Downs
Post by Elliotte Harold
Then consider that you want to be able to make queries like, "Find all
the paragraphs containing both the words 'Bush' and 'incompetent'" so
you can't just shove everything into a BLOB.
Two words: text search.
Nope, not the same thing at all.

Index engines like FAST and Lucene can do part of this (though they
can't really take advantage of the structure of the documents they
index). However those are *non-relational* systems. Of course if you
want to search web-size collections, relational databases just can't
handle it. Index engines are the only proven technology that can.

Mark Logic claims their native XML database can search web size
collections too, but I remain unconvinced on that point.

Some relational databases have added non-relational, fulltext search
extensions to their products just as some have added non-relational XML
extensions. These are adequate for simple uses, if you don't push them
too hard. However they are completely incapable of carrying out queries
like, "Give me the title and first paragraph of every chapter of this
book" (something Safari routinely does) because they don't see the
structure of a document, only the text.
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/
csnyder
2007-08-09 14:37:53 UTC
Post by Elliotte Harold
Some relational databases have added non-relational, fulltext search
extensions to their products just as some have added non-relational XML
extensions. These are adequate for simple uses, if you don't push them
too hard. However they are completely incapable of carrying out queries
like, "Give me the title and first paragraph of every chapter of this
book" (something Safari routinely does) because they don't see the
structure of a document, only the text.
I'm glad we have multiple tools to bring to bear on this kind of
problem, because I worry about the performance implications of
querying an XML database for the average price of those books, or
performing an operation that adds another field (tag?) to each book's
"record".

If it's not too much trouble, could you give us some other use cases
for an XML database? Because title and first paragraph, if that's
something a system "routinely does" could easily be stored as
relational data at the time of import.
--
Chris Snyder
http://chxo.com/
Kenneth Downs
2007-08-09 15:18:25 UTC
Post by csnyder
Post by Elliotte Harold
Some relational databases have added non-relational, fulltext search
extensions to their products just as some have added non-relational XML
extensions. These are adequate for simple uses, if you don't push them
too hard. However they are completely incapable of carrying out queries
like, "Give me the title and first paragraph of every chapter of this
book" (something Safari routinely does) because they don't see the
structure of a document, only the text.
Select title
,SUBSTRING(text ...insert regexp here...)
from chapters
where book_name = 'XML in a Nutshell'

Rusty, you appear to be arguing from ignorance, very unusual coming from
you.

The true difference between us in this argument is that I understand
that I have a prejudice for relational over hierarchical, based on my
knowledge and use of both, and based on judgment calls as to how to get
through the day. I daresay however that you are promoting a religious
favoring of XML w/o a working knowledge of the alternatives.

You simply cannot defend a file format as a foundation for frameworks
and databases. The best you can do is defend the model, such as the
hierarchical model.

Going further, you cannot defend a file format as a foundation for
anything based on how it handles large text (or binary) fields. There
are three issues here:

-> Data model, hierarchical vs. relational.
-> File format, XML vs YAML or JSON or any other format you like
-> Handling of large text (and binary) columns.

Finally, if we can all admit that XML is just a file format, then the
entire framework crumbles as soon as somebody comes up with a better
one, because let's admit it, XML is just about the worst you're going to
find.

In conclusion, the examples you provide appear to give advantage to XML
because tools exist to handle data that has been buried in opaque
formats and poorly defined structures. If the data had been structured
properly in the first place and put into formats that were not so
opaque, using (pardon me for saying) a *real* database, designed on
solid principles, the examples you give become child's play.
Post by csnyder
I'm glad we have multiple tools to bring to bear on this kind of
problem, because I worry about the performance implications of
querying an XML database for the average price of those books, or
performing an operation that adds another field (tag?) to each book's
"record".
If it's not too much trouble, could you give us some other use cases
for an XML database? Because title and first paragraph, if that's
something a system "routinely does" could easily be stored as
relational data at the time of import.
--
Kenneth Downs
Secure Data Software, Inc.
www.secdat.com www.andromeda-project.org
631-689-7200 Fax: 631-689-0527
cell: 631-379-0010

Elliotte Harold
2007-08-12 02:57:17 UTC
Post by Kenneth Downs
Select title
,SUBSTRING(text ...insert regexp here...)
from chapters
where book_name = 'XML in a Nutshell'
Regexps can't do that, though. Regular expressions are an insufficiently
powerful tool for processing XML. Trying to do that is just a world of
pain.
Post by Kenneth Downs
Rusty, you appear to be arguing from ignorance, very unusual coming from
you.
Funny how you confuse different experiences with ignorance. Have you
ever worked in publishing? Or in library science? Or on anything that
operates at web scale like Yahoo or Google? There are many use cases
where a couple of months of hard labor will rapidly disabuse anyone of
the belief that relational databases are the one true solution to all
problems. Your career just happens not to have taken you down those
paths yet.
Post by Kenneth Downs
The true difference between us in this argument is that I understand
that I have a prejudice for relational over hierarchical, based on my
knowledge and use of both, and based on judgment calls as to how to get
through the day. I daresay however that you are promoting a religious
favoring of XML w/o a working knowledge of the alternatives.
Ken, you know me. Do you really think I don't know the relational model
or what it's good for? I use relational databases all the time, and I'm
using them now. However unlike you I've hit their limits. While I'm sure
many people can profitably spend their life doing nothing but relational
databases, I happen to be working on applications where neither the
relational model nor the actual SQL databases out there can come close
to managing my data. I've never said that all applications should use
XML databases or other non-relational systems; you keep trying to put
those words into my mouth. I do say that some applications, especially
in publishing and web publishing, do not fit the relational model well
and can be better served by XML databases.
Post by Kenneth Downs
You simply cannot defend a file format as a foundation for frameworks
and databases. The best you can do is defend the model, such as the
hierarchical model.
XML is not a file format. We've been down this road before. A native XML
database is no more based on a file format than MySQL is based on
tab-delimited text.
Post by Kenneth Downs
Going further, you cannot defend a file format as a foundation for
anything based on how it handles large text (or binary) fields. There
-> Data model, hierarchical vs. relational.
-> File format, XML vs YAML or JSON or any other format you like
-> Handling of large text (and binary) columns.
Finally, if we can all admit that XML is just a file format, then the
entire framework crumbles as soon as somebody comes up with a better
one, because let's admit it, XML is just about the worst you're going to
find.
Troll. Troll. Troll.
Post by Kenneth Downs
In conclusion, the examples you provide appear to give advantage to XML
because tools exist to handle data that has been buried in opaque
formats and poorly defined structures. If the data had been structured
properly in the first place and put into formats that were not so
opaque, using (pardon me for saying) a *real* database, designed on
solid principles, the examples you give become child's play.
LOL. Seriously, try storing a book or an encyclopedia in a relational
database with anything approximating 1NF, not even 2NF. Then try to
make it perform adequately.

Not all data fits neatly into tables.
Post by Kenneth Downs
Post by csnyder
I'm glad we have multiple tools to bring to bear on this kind of
problem, because I worry about the performance implications of
querying an XML database for the average price of those books, or
performing an operation that adds another field (tag?) to each book's
"record".
Average prices, or adding a field, can be done pretty fast. I don't know
if it's as fast as Oracle or MySQL. I don't much care. Sales systems are
exactly the sort of apps that relational databases fit well. But
actually publishing the books? That's a very different story.
Post by Kenneth Downs
Post by csnyder
If it's not too much trouble, could you give us some other use cases
for an XML database? Because title and first paragraph, if that's
something a system "routinely does" could easily be stored as
relational data at the time of import.
Just surf around Safari sometime. Think about what it's doing. Then try
to imagine doing that on top of a relational database.

Think about combining individual chapters, sections, and even smaller
divisions to make new one-off books like Safari U does. Consider the
generation of tables of contents and indexes for these books.

Closer to home, think about a blogging system or a content management
system. Now imagine what you could do if the page structure were
actually queryable, and not just an opaque blob in MySQL somewhere.
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/
Josh McCormack
2007-08-12 03:27:05 UTC
On 8/11/07, Elliotte Harold <elharo at metalab.unc.edu> wrote:
<snip>
Post by Elliotte Harold
Closer to home, think about a blogging system or a content management
system. Now imagine what you could do if the page structure were
actually queryable, and not just an opaque blob in MySQL somewhere.
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/
Do you have any recommended reading on XML CMS? Do you know of any
that are open source and in a useful state?
--
Josh McCormack
Owner, InteractiveQA
Web testing & development
http://www.interactiveqa.com
917.620.4902
Elliotte Harold
2007-08-13 23:36:00 UTC
Post by Josh McCormack
Do you have any recommended reading on XML CMS? Do you know of any
that are open source and in a useful state?
Several people asked me this so rather than responding individually I
just wrote up some thoughts and put them here:

http://cafe.elharo.com/xml/the-state-of-native-xml-databases/

The final word on this subject has not yet been written, but I think
this is a decent summary of what's available in the native XML DB space
as of August, 2007.

Roughly I think we're where SQL was in 1995: some good payware products
and some iffy but promising open source options. I expect the open
source options will improve into production-worthy systems with time,
just as MySQL and PostgreSQL did over the last decade.
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/
Steve Manes
2007-08-12 04:17:04 UTC
Post by Elliotte Harold
Closer to home, think about a blogging system or a content management
system. Now imagine what you could do if the page structure were
actually queryable, and not just an opaque blob in MySQL somewhere.
This is a fascinating discussion. I can see how an NXD might be a very
good fit for medical information systems, where a logical encounter record
might encompass dozens of normalized relational model tables. What are
some of the stable, production-worthy open source servers out there?
Kenneth Downs
2007-08-29 11:51:23 UTC
Post by Elliotte Harold
Post by Kenneth Downs
Select title
,SUBSTRING(text ...insert regexp here...)
from chapters
where book_name = 'XML in a Nutshell'
Regexps can't do that though. Regular expression are an insufficiently
powerful tool for processing XML. Trying to do that is just a world of
pain.
????

The example shows a query of a table, not XML. The purpose is to
demonstrate with a quick snippet that all examples of a supposed
indispensable need for the "XML Database" stem from an ignorance of the
abilities of other tools.

Say that you prefer XML, say that you like it, say that you are used to
using it, but don't say that it is a fundamental requirement of the data
itself because it just ain't so.
Post by Elliotte Harold
Post by Kenneth Downs
Rusty, you appear to be arguing from ignorance, very unusual coming
from you.
Funny how you confuse different experiences with ignorance. Have you
ever worked in publishing? Or in library science? Or on anything that
operates at web scale like Yahoo or Google? There are many use cases
where a couple of months of hard labor will rapidly disabuse anyone of
the belief that relational databases are the one true solution to all
problems. Your career just happens not to have taken you down those
paths yet.
My observation on your arguments stems from your repeatedly ignoring
obvious examples of where tables do just fine to store data, and the
claim that 80% of the world's apps need an XML database.

If you have gotten used to using XML for text, then say so. If you like
it, then say so. Don't say it is the only tool available because it is
not. It has many very serious drawbacks, verbosity being the very
first, not to mention the confounding of structure and implementation,
encouraging the illusion of "structureless" data, and so on.
Post by Elliotte Harold
Post by Kenneth Downs
The true difference between us in this argument is that I understand
that I have a prejudice for relational over hierarchical, based on my
knowledge and use of both, and based on judgment calls as to how to
get through the day. I daresay however that you are promoting a
religious favoring of XML w/o a working knowledge of the alternatives.
Ken, you know me. Do you really think I don't know the relational
model or what it's good for? I use relational databases all the time,
and I'm using them now. However unlike you I've hit their limits.
While I'm sure many people can profitably spend their life doing
nothing but relational databases, I happen to be working on
applications where neither the relational model nor the actual SQL
databases out there can come close to managing my data. I've never
said that all applications should use XML databases or other
non-relational systems, You keep trying to put those words into my
mouth. I do say that some applications, especially in publishing and
web publishing, do not fit the relational model well and can better
served by XML databases.
I do know you, and that is why I was struck by your pro-XML stance for
"80% of applications", in which you must either be ignorant of what most
applications really need, or what modern RDBMS's can do, or both.

Forget about E. F. Codd and the relational model for a moment; let's just
look at the real products that have come along, the table-based servers
we call RDBMS's. These have all solved the very basic issues of data
storage. Most of their power comes from so-called "ACID" compliance,
the ability to allow multiple simultaneous users to access a data store
with assurances of predictable behavior. Your XML databases must solve
these same issues.

What about security? The modern RDBMS defines security on all objects.
Your XML databases will have to provide the ability to define security
on the complete tree. (By the way, I'm sure they'll get there, just keep
reading).

But there is one aspect of the relational model where XML, as a format,
takes a huge leap backward. Codd realized the incredible productivity
gains that could be had if a programmer could access data by name and
not worry about its internal storage structure. He separated the
implementation from the interface. XML, as a format (file, data,
whatever), confounds these two. It is a verbose format for hierarchical
data. There are better formats for nearly all uses.

Here's the clincher. Let's say the XML database grows up and has all of
these things. On that day the only thing it will have in common with
XML is the hierarchical model; the XML format itself will be the first to
go. The ability to accept XQuery statements will be a historical
footnote, and people will end up hating XQuery as much as they hate SQL
(everybody's least favorite part of the RDBMS world). These databases
will end up supporting output formats such as YAML, JSON, and others, and
probably inputs as well. There is just not a lot in the XML format that
really lends itself to data storage.

We can thank XML for making us conscious of the ubiquitous need for
hierarchical data. I use it all of the time. Personally I store my
database definitions in YAML, a hierarchical data format that is human
readable/writable (unlike XML) as well as machine readable/writable.
My programs return hierarchical data from AJAX requests as JSON, because
that's what the browser works best with, and all of my PHP programs
handle all data universally as associative arrays, which are just
hierarchical data in yet another disguise. I love hierarchies, but I have
no use for a format that is not human readable/writable and that is
incredibly verbose.
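
For what it's worth, that round trip is about one call in each direction with PHP 5.2's json extension (the array contents here are made up):

<?php
// An associative array -- hierarchical data in PHP's native disguise.
$menu = array(
    'label'    => 'Reports',
    'children' => array(
        array('label' => 'Daily',  'url' => '/reports/daily'),
        array('label' => 'Weekly', 'url' => '/reports/weekly'),
    ),
);

// Hand it to the browser as JSON in an AJAX response.
header('Content-Type: application/json');
echo json_encode($menu);

// Going the other way, json_decode($json, true) returns the same associative array.
?>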

So when I say you are arguing from ignorance, I am saying that you are
generalizing your own experience with heavy-duty text management, and
since you have never mentioned any of the topics above, you may not have
the entire picture.

Now, to your point about my own limited experience, I picked a path some
years ago that has made me an expert in some areas and ignorant in
others. But I don't go claiming that "80% of the world's applications
cannot use RDBMS". In fact, the examples you raise are all examples of
text management. This is a new area that the RDBMS was never intended
to solve. Many people have found it easily possible to extend the RDBMS
in a few areas, but others (such as you) are saying we need to start
over. But it is amusing that the look-again crowd has started over with
hierarchical data. In the end it won't be the format that is used, but
the basic abilities to manage and store text. I submit that the clear
solution has yet to emerge from that pursuit.
Post by Elliotte Harold
Post by Kenneth Downs
You simply cannot defend a file format as a foundation for frameworks
and databases. The best you can do is defend the model, such as the
hierarchical model.
XML is not a file format. We've been down this road before. A native
XML database is no more based ona file format than MySQL is based on
tab delimited text.
But you are not saying what it is based upon. My statements above about
ACID compliance, security, and separation of implementation from
interface provide a basis for a database. The structure of the data is
given by tables. This makes a complete system.

If you cannot provide the basis for the entire picture of data
management, we are left with what the XML books tell me: how to format
the file.
Post by Elliotte Harold
Post by Kenneth Downs
Going further, you cannot defend a file format as a foundation for
anything based on how it handles large text (or binary) fields.
-> Data model, hierarchical vs. relational. -> File format, XML vs
YAML or JSON or any other format you like
-> Handling of large text (and binary) columns.
Finally, if we can all admit that XML is just a file format, then the
entire framework crumbles as soon as somebody comes up with a better
one, because let's admit it, XML is just about the worst you're going
to find.
Troll. Troll. Troll.
???? Geez Rusty, come on. My conclusion is worded harshly, yes, but do
you really label as a troll a description of the larger issues of
formats, data models, and everything else that makes up the larger picture?
Post by Elliotte Harold
Post by Kenneth Downs
In conclusion, the examples you provide appear to give advantage to
XML because tools exist to handle data that has been buried in opaque
formats and poorly defined structures. If the data had been
structured properly in the first place and put into formats that were
not so opaque, using (pardon me for saying) a *real* database,
designed on solid principles, the examples you give become child's play.
LOL. Seriously, try storing a book or an encyclopedia in a relational
database with anything approximating 1NF, not even 2NF. Then try and
make it perform adequately.
Not all data fits neatly into tables.
Actually most data does not, not at first glance. But since a table is
simply a mapping of properties to entities, it turns out that most data
does when you look at it closely. It takes about the same effort as
deciding upon a set of tags, since it is of course exactly the same process.

The crucial question is, does your book have structure? Can you make up
tags as you go, or are you limited to a pre-defined set, such as DocBook?
Once you commit to a specific set of tags, you have committed to a
structure, and you may as well use tables as anything else. Methinks
however that at this point it comes down to what you are comfortable
with. If you want to use XML, go for it; if you want to use tables, go
for it; just don't confuse the structure of the data with a fundamental
need for either system.
--
Kenneth Downs
Secure Data Software, Inc.
www.secdat.com www.andromeda-project.org
631-689-7200 Fax: 631-689-0527
cell: 631-379-0010
Elliotte Harold
2007-08-12 03:07:02 UTC
Post by csnyder
If it's not too much trouble, could you give us some other use cases
for an XML database? Because title and first paragraph, if that's
something a system "routinely does" could easily be stored as
relational data at the time of import.
Storing books, web pages, and the like in a relational database has only
two basic approaches: make it a blob or cut it into tiny little pieces.
The first eliminates search capabilities; the second performs like a dog.

You're right that if grabbing the title and the first paragraph is all
you need to do, then these two pieces could be stored separately (and
normalization be damned). Now suppose the editor comes along and tells
you they really want to show the first two paragraphs of text to
non-logged-in users instead of just the first one. If you know the
queries in advance, you can lay out the data to optimize them, but using
the data's own structure will give you much more flexibility for
unexpected uses.

Here's another common use case: extract the links from a web site. Do a
Google-like reverse index that finds all the pages linking to this one.
The only way to make that happen in a relational DB is to chop the
content into so many trivially small pieces that putting them back
together again is prohibitively expensive. And even once you've done
that, the SQL to pull the result is ungodly ugly. The XQuery is a lot
simpler because it matches the natural structure of the documents rather
than treating everything as a table. Some data wants to live in tables.
Some doesn't.
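
To make the contrast concrete, the structural half of that query over one stored page looks roughly like this with PHP's DOM and XPath; a native XML database would phrase the same thing in XQuery and evaluate it across every stored page, and the reverse index is then a matter of grouping those results by target URL (the file name is invented):

<?php
// Pull every link target out of one stored XHTML page, treated as a document, not rows.
$doc = new DOMDocument();
$doc->load('stored-page.xhtml');   // hypothetical page from the content store

$xpath = new DOMXPath($doc);
$xpath->registerNamespace('x', 'http://www.w3.org/1999/xhtml');

foreach ($xpath->query('//x:a/@href') as $href) {
    echo $href->value, "\n";
}
?>
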
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/
csnyder
2007-08-13 17:03:55 UTC
Permalink
Post by Elliotte Harold
Here's another common use case: extract the links from a web site. Do a
Google-like reverse index that finds all the pages linking to this one.
The only way to make that happen in a relational DB is to chop the
content into so many trivially small pieces that putting them back
together again is prohibitively expensive. And even once you've done
that, the SQL to pull the result is ungodly ugly. The XQuery is a lot
simpler because it matches the natural structure of the documents rather
than treating everything as a table. Some data wants to live in tables.
Some doesn't.
Ah, now _that's_ a great example, and something that CMS developers
often need to do after the fact (as in link-checking, or generating a
graph of sites you link to for SEO purposes).

My first instinct would be to look for XPath support in my relational
db, and indeed MySQL does this:
http://dev.mysql.com/tech-resources/articles/mysql-5.1-xml.html

But if a native-XML database can do it better or much more efficiently
for large datasets, then it is certainly worth investigating.
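
A rough sketch of what that MySQL 5.1 XPath route might look like from PHP,
using the ExtractValue() function the linked article covers. The "pages"
table, its "doc" XML column, and the connection details are hypothetical:

<?php
$db  = new mysqli('localhost', 'user', 'pass', 'site');
$res = $db->query(
    "SELECT id, ExtractValue(doc, '/page/title') AS title FROM pages"
);
while ($row = $res->fetch_assoc()) {
    echo $row['id'], ': ', $row['title'], "\n";
}
?>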
--
Chris Snyder
http://chxo.com/
Jon Baer
2007-08-13 21:26:46 UTC
Permalink
Immediately following this conversation I stopped by B&N to pick up
this book:

http://www.amazon.com/o/ASIN/0596006349
XQuery by Priscilla Walmsley

It mainly goes over XPath 2.0 vs. 1.0 for most of the book, but
overall it is a *great* insight into the topic. Also very good
examples on FLWOR (http://en.wikipedia.org/wiki/FLWOR).

- Jon
Post by csnyder
My first instinct would be to look for XPath support in my relational
http://dev.mysql.com/tech-resources/articles/mysql-5.1-xml.html
But if a native-XML database can do it better or much more efficiently
for large datasets, then it is certainly worth investigating.
--
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.nyphp.org/pipermail/talk/attachments/20070813/0a6a4d85/attachment.html>
Elliotte Harold
2007-08-08 14:12:46 UTC
Permalink
Post by Kenneth Downs
Not a chance. There is no such thing as a native XML database and there
never will be because XML is a file format (oops, data format), and an
extremely inefficient one at that. To have a native database you need a
data model. XML uses the hierarchical model and if you're going to
build a native hierarchical database you sure wouldn't use the wasteful
XML format internally to store the data.
And if you insist on using XML just because its so wonderful to use 17
characters to store the state <STATE>NY</STATE>, it will never be able
to compete with even the most immature relational engines for pure
speed. Maybe on example and toy sites, but never for anything that
needs to scale.
If I'm wrong, dinner's on me.
Which part do you have to be wrong about for me to get a free dinner? I
can already give you several existence proofs demonstrating that there
is such a thing as a native XML database:

http://www.marklogic.com/
http://exist.sourceforge.net/

More are coming.
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/
Kenneth Downs
2007-08-08 14:41:32 UTC
Permalink
Post by Elliotte Harold
Post by Kenneth Downs
Not a chance. There is no such thing as a native XML database and
there never will be because XML is a file format (oops, data format),
and an extremely inefficient one at that. To have a native database
you need a data model. XML uses the hierarchical model and if you're
going to build a native hierarchical database you sure wouldn't use
the wasteful XML format internally to store the data.
And if you insist on using XML just because its so wonderful to use
17 characters to store the state <STATE>NY</STATE>, it will never be
able to compete with even the most immature relational engines for
pure speed. Maybe on example and toy sites, but never for anything
that needs to scale.
If I'm wrong, dinner's on me.
Which part do you have to be wrong about for me to get a free dinner?
I can already give you several existence proofs demonstrating that
http://www.marklogic.com/
http://exist.sourceforge.net/
A sourceforge project does not a phenomenon make. I guess when the
banks and airlines have our data in the XML files, and I don't mean a
few hybrid patchwork examples, I mean the hardcore permanent long-term
stuff as well as the transactional support for the reservations we're
making all day. When that happens, we'll start picking the restaurants.
Post by Elliotte Harold
More are coming.
--
Kenneth Downs
Secure Data Software, Inc.
www.secdat.com www.andromeda-project.org
631-689-7200 Fax: 631-689-0527
cell: 631-379-0010
Elliotte Harold
2007-08-08 15:11:01 UTC
Permalink
Post by Kenneth Downs
A sourceforge project does not a phenomenon make. I guess when the
banks and airlines have our data in the XML files, and I don't mean a
few hybrid patchwork examples, I mean the hardcore permanent long-term
stuff as well as the transactional support for the reservations we're
making all day. When that happens, we'll start picking the restaurants.
I've never said native XML databases will replace the old-school apps
like banking accounts that fit relational databases very well. What they
will do is enable new applications that simply cannot be built on top of
a relational database, applications like Safari (the book site, not the
web browser):

http://safari.oreilly.com/

There are others, mostly in publishing because that's the first area
where there's a large backlog of information and applications that
relational database vendors have been unable to serve. However, other
companies will come online as they begin to realize they can manage all
their data with these tools, not just the 20% of it that fits neatly
into rectangles. Expect to see more uptake in law firms, advertising,
media, education, government, and other sectors that have large numbers
of critical documents where order matters, duplication is a fact of
life, and normalization is not just a bad idea but outright impossible.

The relational model is a very powerful model, but it achieves its power
at the cost of restricting what it's possible to store. Other models are
needed for other applications.
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/
Kenneth Downs
2007-08-08 20:43:53 UTC
Permalink
Post by Elliotte Harold
Post by Kenneth Downs
A sourceforge project does not a phenomenon make. I guess when the
banks and airlines have our data in the XML files, and I don't mean a
few hybrid patchwork examples, I mean the hardcore permanent
long-term stuff as well as the transactional support for the
reservations we're making all day. When that happens, we'll start
picking the restaurants.
I've never said native XML databases will replace the old-school apps
like banking accounts that fit relational databases very well. What
they will do is enable new applications that simply cannot be built on
top of a relational database, applications like Safari (the book site,
http://safari.oreilly.com/
There are others, mostly in publishing because that's the first area
where there's a large backlog of information and applications that
relational database vendors have been unable to serve. However other
companies will come online as they begin to realize they can manage
all their data with these tools, not just the 20% of it that fits
neatly into rectangles. Expect to see more uptake in law firms,
advertising, media, education, government, and other sectors that have
large numbers of critical documents where order matters, duplication
is a fact of life, and normalization is not just a bad idea but
outright impossible.
The relational model is a very powerful model, but it achieves its
power at the cost of restricting what it's possible to store. Other
models are needed for other applications.
Two words: document management.
--
Kenneth Downs
Secure Data Software, Inc.
www.secdat.com www.andromeda-project.org
631-689-7200 Fax: 631-689-0527
cell: 631-379-0010
Edward Potter
2007-08-08 21:15:56 UTC
Permalink
I think the scary reality is that those banks and CC companies are
generally managing all their database needs with some abstracted
Cobol. It does the job. Everyone's afraid to touch it. And the folks
that built it died years ago. yipes!

:-) ed
Post by Kenneth Downs
Post by Elliotte Harold
Post by Kenneth Downs
A sourceforge project does not a phenomenon make. I guess when the
banks and airlines have our data in the XML files, and I don't mean a
few hybrid patchwork examples, I mean the hardcore permanent
long-term stuff as well as the transactional support for the
reservations we're making all day. When that happens, we'll start
picking the restaurants.
I've never said native XML databases will replace the old-school apps
like banking accounts that fit relational databases very well. What
they will do is enable new applications that simply cannot be built on
top of a relational database, applications like Safari (the book site,
http://safari.oreilly.com/
There are others, mostly in publishing because that's the first area
where there's a large backlog of information and applications that
relational database vendors have been unable to serve. However other
companies will come online as they begin to realize they can manage
all their data with these tools, not just the 20% of it that fits
neatly into rectangles. Expect to see more uptake in law firms,
advertising, media, education, government, and other sectors that have
large numbers of critical documents where order matters, duplication
is a fact of life, and normalization is not just a bad idea but
outright impossible.
The relational model is a very powerful model, but it achieves its
power at the cost of restricting what it's possible to store. Other
models are needed for other applications.
Two words: document management.
--
Kenneth Downs
Secure Data Software, Inc.
www.secdat.com www.andromeda-project.org
631-689-7200 Fax: 631-689-0527
cell: 631-379-0010
_______________________________________________
New York PHP Community Talk Mailing List
http://lists.nyphp.org/mailman/listinfo/talk
NYPHPCon 2006 Presentations Online
http://www.nyphpcon.com
Show Your Participation in New York PHP
http://www.nyphp.org/show_participation.php
--
the Blog: http://www.utopiaparkway.com
the Karma: http://www.coderswithconscience.com
the Projects: http://flickr.com/photos/86842405 at N00/
the Store: http://astore.amazon.com/httpwwwutopic-20
Jeremy Mcentire
2007-08-08 15:40:41 UTC
Permalink
Post by Kenneth Downs
Post by Elliotte Rusty Harold
Of course, if you really want to rock, try combining XQuery+XQueryP
+APP+a native XML database. Once the tooling matures a bit, that's
a stack that's going to make all previous web dev frameworks look
like PowerBuilder.
Not a chance. There is no such thing as a native XML database and
there never will be because XML is a file format (oops, data
format), and an extremely inefficient one at that. To have a
native database you need a data model. XML uses the hierarchical
model and if you're going to build a native hierarchical database
you sure wouldn't use the wasteful XML format internally to store
the data.
And if you insist on using XML just because its so wonderful to use
17 characters to store the state <STATE>NY</STATE>, it will never
be able to compete with even the most immature relational engines
for pure speed. Maybe on example and toy sites, but never for
anything that needs to scale.
By that argument, current RDBMSs use 7 characters to store a state:
StateNY.

Of course, we'd use the same for an XML-based DBMS's storage, with
logic that constructs <STATE>NY</STATE> upon request. After all, the
state code requires some extra characters so that the DBMS can organize
it. A file with no metadata that just said NYMONJOKTXAZCA wouldn't make
much sense, would it?
Kenneth Downs
2007-08-08 20:41:21 UTC
Permalink
Post by Jeremy Mcentire
Post by Kenneth Downs
Post by Elliotte Rusty Harold
Of course, if you really want to rock, try combining
XQuery+XQueryP+APP+a native XML database. Once the tooling matures a
bit, that's a stack that's going to make all previous web dev
frameworks look like PowerBuilder.
Not a chance. There is no such thing as a native XML database and
there never will be because XML is a file format (oops, data format),
and an extremely inefficient one at that. To have a native database
you need a data model. XML uses the hierarchical model and if you're
going to build a native hierarchical database you sure wouldn't use
the wasteful XML format internally to store the data.
And if you insist on using XML just because its so wonderful to use
17 characters to store the state <STATE>NY</STATE>, it will never be
able to compete with even the most immature relational engines for
pure speed. Maybe on example and toy sites, but never for anything
that needs to scale.
StateNY.
Of course, we'd use the same for an XML-based DBMS for storage with
<STATE>NY</STATE> upon request. Since, we all know that the state
code requires some extra characters so that the DBMS could organize
it. A file with no meta-data that said: NYMONJOKTXAZCA wouldn't make
much sense, would it?
_______________________________________________
Jeremy, I'm not sure if you're serious. You do know that the entire
response is, as we say, "not even wrong": it's so incorrect that it
can't be corrected.

OK, I fell for it, you got me....
--
Kenneth Downs
Secure Data Software, Inc.
www.secdat.com www.andromeda-project.org
631-689-7200 Fax: 631-689-0527
cell: 631-379-0010
csnyder
2007-08-07 14:23:10 UTC
Permalink
Post by Hans Zaunere
While it seems like a good idea, it'd probably cause more grief... think of
all those poor images, CSS, PDFs, JS, static HTML, etc. files out there that
we assume get served directly, correctly, statically - and quickly - right
from the filesystem.
But that's what caching proxies are for!

You're totally right, though, being able to mix and match gets a lot
more out of a single server instance.
--
Chris Snyder
http://chxo.com/
Michael Sims
2007-08-05 18:24:07 UTC
Permalink
Post by Elliotte Harold
I'm considering a simple site that I may design in PHP. PHP is probably
the simplest solution except for one thing: it carries a very strong
coupling between pages and scripts. As far as I've ever been able to
tell PHP really, really, really wants there to be a single primary .php
file for each URL that does not contain a query string (though that file
may of course invoke others).
For the system I'm designing that simply won't work. In Java servlet
environments it's relatively trivial to map one servlet to an entire
directory structure, so that it handles all requests for all pages
within that hierarchy.
You have to think of the whole file-serving hierarchy here. Apache gets a
request for a URL. What will Apache do with it? It will find a file
somewhere that matches that URL, and then it will either send that file
off for further processing (if the file's type is registered as such within
Apache) or just send that file to the browser as-is. PHP
doesn't even enter the picture until the decision you're talking about
has already been made by Apache.

I don't know much about Java servlets, but I strongly suspect it's the
same - the Sun web server or whatever you're using is making that decision.
It may appear to be "simpler" than the PHP/Apache combination, but it really
isn't. Perhaps it is better integrated because both the web server and the
programming language are products of one company, but it's not any simpler
when it executes.

In any case, the correct answer is just to tell Apache to serve file X for
every URL that looks like Y or Z or W. Mod_rewrite. That's what it's
there for, and it does its job well, and it can be as simple as a couple of
lines in an .htaccess file.

No doubt one could partly handle this with PHP files:

foo.com/index.php - lots of code
foo.com/dir1/index.php - PHP file sends everything to foo.com/index.php
foo.com/dir2/index.php - PHP file sends everything to foo.com/index.php
foo.com/dir1/dir3/index.php - PHP file sends everything to foo.com/index.php

but telling Apache to use foo.com/index.php for all requests is simpler and
less error-prone. It can be quite simple:

Contents of foo.com/yourdir/.htaccess file:
-------------------------------------------------------
RewriteEngine On
RewriteOptions inherit

RewriteBase /yourdir

RewriteRule ^([0-9A-Za-z]+)/([0-9A-Za-z]+)/ index.php?var1=$1&var2=$2
RewriteRule ^([0-9A-Za-z]+)/([0-9A-Za-z]+) index.php?var1=$1&var2=$2
RewriteRule ^([0-9A-Za-z]+)/ index.php?var1=$1
--------------------------------------------------------

Now, any request for foo.com/yourdir/anything/anythingatall will be sent to
foo.com/yourdir/index.php, which will see the extra "directories" as URL
variables. The user will not know what's happening - they'll continue to
see the "directories" in their browser status bar.


Michael Sims
Dell Sala
2007-08-05 18:26:41 UTC
Permalink
Post by Elliotte Harold
For the system I'm designing that simply won't work. In Java
servlet environments it's relatively trivial to map one servlet to
an entire directory structure, so that it handles all requests for
all pages within that hierarchy.
Is there any *reasonable* way to do this in PHP?
Here's an example of how to do it without using mod_rewrite. The
basic idea is to build a front controller that parses the
$_SERVER['PATH_INFO'] variable and delegates the request appropriately for
your application.

http://www.zend.com/zend/trick/tricks-apr-2003-urls.php

* gotcha: I've had problems getting this to work when PHP is
installed as a CGI.
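
For what it's worth, here's roughly what such a front controller might look
like - a minimal sketch of the PATH_INFO approach; the handler script name
is made up:

<?php
// A request for /index.php/news/2007/08/05 arrives here with
// $_SERVER['PATH_INFO'] = '/news/2007/08/05'.
$path  = isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : '/';
$parts = array_values(array_filter(explode('/', $path), 'strlen'));

// first segment picks the handler, the rest become its arguments
$handler = empty($parts) ? 'home' : array_shift($parts);

switch ($handler) {
    case 'news':
        include 'handlers/news.php';   // hypothetical handler script
        break;
    default:
        header('HTTP/1.0 404 Not Found');
        echo 'Not found';
}
?>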
Post by Elliotte Harold
The only way I've ever seen is what WordPress does: use mod_rewrite
[...] I am impressed by this hack, but it's way too kludgy for me
to be comfortable with.
I agree, I'm not fond of the mod_rewrite solution. However, I've
found it to be the only reliable method when PHP is running as a CGI.

-- Dell
Graham Hagger
2007-08-05 18:31:04 UTC
Permalink
We actually do this at my work. I don't have the details to hand, but
it basically involves setting the Apache document root to be your actual
script, i.e. index.php. That way, no matter what URL you request, you will
always hit that page.

The script then examines the URL that was requested to determine exactly
which page should be rendered.

Sorry to not have details - but it is possible.

Graham
Post by Elliotte Harold
I'm considering a simple site that I may design in PHP. PHP is
probably the simplest solution except for one thing: it carries a very
strong coupling between pages and scripts. As far as I've ever been
able to tell PHP really, really, really wants there to be a single
primary .php file for each URL that does not contain a query string
(though that file may of course invoke others).
For the system I'm designing that simply won't work. In Java servlet
environments it's relatively trivial to map one servlet to an entire
directory structure, so that it handles all requests for all pages
within that hierarchy.
Is there any *reasonable* way to do this in PHP? The only way I've
ever seen is what WordPress does: use mod_rewrite to redirect all
requests within the hierarchy to a custom dispatcher script that
converts actual hierarchy components into query string variables. I am
impressed by this hack, but it's way too kludgy for me to be
comfortable with. For one thing, I don't want to depend on mod_rewrite
if I don't have to.
Surely by now there's a better way? How do I overcome the one file per
URL assumption that PHP makes?
Hans Zaunere
2007-08-05 18:42:26 UTC
Permalink
Post by Elliotte Harold
I'm considering a simple site that I may design in PHP. PHP is
probably the simplest solution except for one thing: it carries a
very strong coupling between pages and scripts. As far as I've ever
been able to tell PHP really, really, really wants there to be a
single primary .php file for each URL that does not contain a query
string (though that file may of course invoke others).
PHP doesn't actually care, but...
Post by Elliotte Harold
For the system I'm designing that simply won't work. In Java servlet
environments it's relatively trivial to map one servlet to an entire
directory structure, so that it handles all requests for all pages
within that hierarchy.
It has to do with the way PHP reaches into the request processing stack in
Apache (assuming Apache). Basically PHP doesn't reach as far up the request
stack as other things do, like mod_perl, Java, etc, which of course could be
argued as a good or bad thing.
Post by Elliotte Harold
Is there any *reasonable* way to do this in PHP? The only way I've
ever seen is what WordPress does: use mod_rewrite to redirect all
requests within the hierarchy to a custom dispatcher script that
converts actual hierarchy components into query string variables. I
am impressed by this hack, but it's way too kludgy for me to be
comfortable with. For one thing, I don't want to depend on
mod_rewrite if I don't have to.
A lot of people use mod_rewrite, but I never was a big fan either. However,
you can implement this "fuse-box" style processing quite elegantly in pure
Apache. There are a number of ways, most of which are covered in these
results:

http://www.google.com/search?q=mediawiki+url+rewrite

There are other options as well, including the ErrorDocument hack and
playing with ForceType, but I'm not much of a fan of those either. I find
the following to be the most elegant:

Alias /support/ "/var/www/www.something.com/support/"
Alias /Test/ "/var/www/www.something.com/Test/"
AliasMatch /(.*) "/var/www/www.something.com/index.php"

/support becomes a nice place to throw static stuff, like images, CSS, etc.

/Test can be used as a test bed, to test PHP scripts outside of the
fuse-box.

And then index.php is where the action is, getting called on every request.
Of course, the above can be adjusted as needed to have an unlimited number
of fuse-boxes at different URLs, etc. Combined with things like Apache's
AddType, the possibilities are endless. I think you can even use AddType
for a directory (or maybe it's ForceType).

I actually leave index.php empty, and use auto_prepend_file to call a PHP
file that handles the heavy lifting. This typically allows for better
delegation of responsibility and keeps PHP code outside of the
DocumentRoot.
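
To make that concrete, a sketch of what such a prepended processor might
look like. The file paths and function names are hypothetical, and the
php_value line would live in the vhost config:

<?php
// /var/www/lib/prepend.php -- wired up with something like
//     php_value auto_prepend_file /var/www/lib/prepend.php
// The aliased index.php stays empty; every request lands here first.
$uri = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);

// hand the URL to whatever does the mapping; both names are made up
require '/var/www/lib/router.php';
route_request($uri);
?>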

And all of the above can be combined with various combinations of
<Directory> and <Location> directives in Apache, making it really flexible
and dizzying.

But, I generally keep it simple and use something like the above, and then
have a request processor in PHP do the URL mapping in a style akin to the
Java world. Straightforward, none of the PATH_INFO confusions, and
setup-and-forget.

---
Hans Zaunere / President / New York PHP
www.nyphp.org / www.nyphp.com
inforequest
2007-08-05 20:45:50 UTC
Permalink
Not everyone cares about search engines indexing unique URLs, but if you
do, you have to consider how the server response codes are generated for
various URLs. Aliasing content under different URLs is akin to asking
the search engines not to index it properly, and/or not to rank it
highly for relevant searches.

If you are using Apache, Michael Sims has it correct. Apache is making
the decisions about URL dispatching, and since mod_rewrite is Apache's
extension for mapping URLs to dispatch, then that's the logical tool for
the job. As far as search engines go, you always need to be aware of
(and work around) how the default dispatching is handled (trailing
slashes, file not found, etc).

If you want cross-platform (server) scripts written in PHP, I'm not sure
you can ever break free of the web server completely and still stay in
compliance with search engine indexing best practices. Of course, once
you are at that level of analysis, I think some of the PHP gurus can
highlight other problems you'll encounter that are more difficult than
how to properly manage the URL dispatch.

I think the Zend Framework is one case where they really, really, really
would like to avoid needing Apache mod_rewrite. As it has evolved, they
have added several rewrite routers to manage the same infrastructure
issues that mod_rewrite is typically used for. The best and only one I
like (for SEO purposes) is a full-blown regex rewrite router, certainly
no "simpler" than mod_rewrite.

-=john andrews
Post by Hans Zaunere
Post by Elliotte Harold
I'm considering a simple site that I may design in PHP. PHP is
probably the simplest solution except for one thing: it carries a
very strong coupling between pages and scripts. As far as I've ever
been able to tell PHP really, really, really wants there to be a
single primary .php file for each URL that does not contain a query
string (though that file may of course invoke others).
PHP doesn't actually care, but...
Post by Elliotte Harold
For the system I'm designing that simply won't work. In Java servlet
environments it's relatively trivial to map one servlet to an entire
directory structure, so that it handles all requests for all pages
within that hierarchy.
It has to do with the way PHP reaches into the request processing stack in
Apache (assuming Apache). Basically PHP doesn't reach as far up the request
stack as other things do, like mod_perl, Java, etc, which of course could be
argued as a good or bad thing.
Post by Elliotte Harold
Is there any *reasonable* way to do this in PHP? The only way I've
ever seen is what WordPress does: use mod_rewrite to redirect all
requests within the hierarchy to a custom dispatcher script that
converts actual hierarchy components into query string variables. I
am impressed by this hack, but it's way too kludgy for me to be
comfortable with. For one thing, I don't want to depend on
mod_rewrite if I don't have to.
A lot of people use mod_rewrite, but I never was a big fan either. However,
you can implement this "fuse-box" style processing quite elegantly in pure
Apache. There are a number of ways, most of which are covered in these
http://www.google.com/search?q=mediawiki+url+rewrite
There are other options as well, including the ErrorDocument hack and
playing with ForceType, but I'm not much of a fan of those either. I find
Alias /support/ "/var/www/www.something.com/support/"
Alias /Test/ "/var/www/www.something.com/Test/"
AliasMatch /(.*) "/var/www/www.something.com/index.php"
/support becomes a nice place to throw static stuff, like images, CSS, etc.
/Test can be used as a test bed, to test PHP scripts outside of the
fuse-box.
And then index.php is where the action is, getting called on every request.
Of course, the above can be adjusted as needed to have an unlimited number
of fuse-boxes at different URLs, etc. Combined with things like Apache's
AddType, the possibilities are endless. I think you can even use AddType
for a directory (or maybe it's ForceType).
I actually leave index.php empty, and use auto_prepend_file to call a PHP
file that handles the heavy lifting. This typically allows for better
delegation of responsibility and keeping PHP code outside of the
DocumentRoot.
And all of the above can be combined with various combinations of
<Directory> and <Location> directives in Apache, making it really flexible
and dizzying.
But, I generally keep it simple and use something like the above, and then
have a request processor in PHP do the URL mapping in a style akin to the
Java world. Straightforward, none of the PATH_INFO confusions, and
setup-and-forget.
--
-------------------------------------------------------------
Your web server traffic log file is the most important source of web business information available. Do you know where your logs are right now? Do you know who else has access to your log files? When they were last archived? Where those archives are? --John Andrews Competitive Webmaster and SEO Blogging at http://www.johnon.com
Rob Marscher
2007-08-06 14:50:10 UTC
Permalink
Post by Hans Zaunere
AliasMatch /(.*) "/var/www/www.something.com/index.php"
http://httpd.apache.org/docs/2.0/mod/mod_alias.html#aliasmatch

I wonder why the major PHP frameworks don't mention this as an
option? It seems from the documentation that it can't go
in .htaccess - so that may be why. The frameworks are assuming most
of their users are on shared servers and can't modify their httpd
config.

If there was an existing file... say favicon.ico... would AliasMatch
know to just serve that up instead of sending through index.php?
That's one thing I like about the following mod_rewrite rule - if the
file or directory exists, it won't pass it to the index.php front
controller:
RewriteCond %{SCRIPT_FILENAME} !-f
RewriteCond %{SCRIPT_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1

-Rob
Hans Zaunere
2007-08-14 19:08:24 UTC
Permalink
Post by Rob Marscher
Post by Hans Zaunere
AliasMatch /(.*) "/var/www/www.something.com/index.php"
http://httpd.apache.org/docs/2.0/mod/mod_alias.html#aliasmatch
I wonder why the major php frameworks don't mention this as an
option? It seems from the documentation that it can't go
in .htaccess - so that may be why. The frameworks are assuming most
of their users are on shared servers and can't modify their httpd
config.
Yeah, that's probably why. We only ever work in large deployments, which of
course have dedicated servers. .htaccess is thus evil, and slower. Neither
clients nor I ever intend to support free-hosting-of-the-month-club.com for
our applications.
Post by Rob Marscher
If there was an existing file... say favicon.ico... would AliasMatch
know to just serve that up instead of sending through index.php?
Yes - Apache is already aware of which file extensions it should
process through PHP. The /support/ alias in my example is really just for
convenience and to keep developers in line. Apache already knows how to
serve files correctly because of AddType (or even ForceType) and their
extension. Thus, a .php file will trigger the auto_prepend to fire and
start up the application framework.
Post by Rob Marscher
That's one thing I like about the following mod_rewrite rule - if the
file or directory exists, it won't pass it to the index.php front
RewriteCond %{SCRIPT_FILENAME} !-f
RewriteCond %{SCRIPT_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1
So using a rewrite rule is redundant, and a hack in my opinion. This is the
fundamental difference between rewriting and aliasing.

---
Hans Zaunere / President / New York PHP
www.nyphp.org / www.nyphp.com
Jon Baer
2007-08-06 01:18:32 UTC
Permalink
I have to say that after spending a long time w/ Dynamo / Tomcat /
Struts and mod_rewrite, I eventually got down to learning the
routing mechanism of MVC frameworks and found it to be extremely
flexible and very well thought out + it could easily replicate a servlet
URI request setup.

http://manual.cakephp.org/chapter/configuration
http://framework.zend.com/manual/en/zend.controller.router.html#zend.controller.router.usage

- Jon
Post by Elliotte Harold
I'm considering a simple site that I may design in PHP. PHP is
probably the simplest solution except for one thing: it carries a
very strong coupling between pages and scripts. As far as I've ever
been able to tell PHP really, really, really wants there to be a
single primary .php file for each URL that does not contain a query
string (though that file may of course invoke others).
For the system I'm designing that simply won't work. In Java
servlet environments it's relatively trivial to map one servlet to
an entire directory structure, so that it handles all requests for
all pages within that hierarchy.
Is there any *reasonable* way to do this in PHP? The only way I've
ever seen is what WordPress does: use mod_rewrite to redirect all
requests within the hierarchy to a custom dispatcher script that
converts actual hierarchy components into query string variables. I
am impressed by this hack, but it's way too kludgy for me to be
comfortable with. For one thing, I don't want to depend on
mod_rewrite if I don't have to.
Surely by now there's a better way? How do I overcome the one file
per URL assumption that PHP makes?
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/
cafeaulaitA/
_______________________________________________
New York PHP Community Talk Mailing List
http://lists.nyphp.org/mailman/listinfo/talk
NYPHPCon 2006 Presentations Online
http://www.nyphpcon.com
Show Your Participation in New York PHP
http://www.nyphp.org/show_participation.php
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.nyphp.org/pipermail/talk/attachments/20070805/268503e5/attachment.html>
Hans Zaunere
2007-08-06 11:55:59 UTC
Permalink
Post by Jon Baer
I have to say that after spending a long time w/ Dynamo / Tomcat /
Struts and mod_rewrite that eventually I got down to learning the
routing mechanism of frameworks (MVC) and find it to be extremely
flexible and very well thought out + could easily replicate a servlet
URI request setup.
http://manual.cakephp.org/chapter/configuration
http://framework.zend.com/manual/en/zend.controller.router.html#zend.controller.router.usage


I think there is a distinction to make here, however.

There are two parts to the URL
routing/handling/SEO-friendly-URL-of-the-month/etc game:

1: getting the web server to relax its by-default strict mapping of the URL
space to the filesystem, and then passing whatever the request was down to
the processing language

2: programming mechanisms in the processing language to intelligently deal
with whatever URL it gets from the web server

In PHP at least, these are two different things. At some point there was
interest in apache_hooks, but it unfortunately never matured.

In other technologies, #1 equates more to the processing language reaching
up and grabbing what it needs from the server, and then taking over. In
PHP, however, it's certainly a "top-down" approach.

For #1, most frameworks, and the two mentioned above, prefer the use
mod_rewrite. Then they each have their own mechanism to handle #2, with
some arguably being worse than others.

There are also fundamental differences in the way #1 can be handled.
Rewriting is a very different thing than aliasing a handler or set of URLs
to a single URL or processor.

The latter is more elegant - and more flexible - avoiding mucking with obtuse
rewrite conditions and rules that require a testing cycle and often depend
on a fixed set of extensions to ignore.

---
Hans Zaunere / President / New York PHP
www.nyphp.org / www.nyphp.com
inforequest
2007-08-06 22:41:45 UTC
Permalink
Post by Hans Zaunere
Post by Jon Baer
I have to say that after spending a long time w/ Dynamo / Tomcat /
Struts and mod_rewrite that eventually I got down to learning the
routing mechanism of frameworks (MVC) and find it to be extremely
flexible and very well thought out + could easily replicate a servlet
URI request setup.
http://manual.cakephp.org/chapter/configuration
http://framework.zend.com/manual/en/zend.controller.router.html#zend.controller.router.usage
I think there is a distinction to make here, however.
There are two parts to the URL
1: getting the web server to relax its by-default strict mapping of the URL
space to the filesystem, and then passing whatever the request was down to
the processing language
2: programming mechanisms in the processing language to intelligently deal
with whatever URL it gets from the web server
In PHP at least, these are two different things. At some point there was
interest in apache_hooks, but it unfortunately never matured.
In other technologies, #1 equates more to the processing language reaching
up and grabbing what it needs from the server, and then taking over. In
PHP, however, it's certainly a "top-down" approach.
For #1, most frameworks, and the two mentioned above, prefer the use
mod_rewrite. Then they each have their own mechanism to handle #2, with
some arguably being worse than others.
There's also fundamental differences in the way #1 can be handled.
Rewriting is a very different thing than aliasing a handler or set of URLs
to a single URL or processor.
The latter is more elegant - and more flexible - avoiding mucking with obtuse
rewrite conditions and rules that require a testing cycle and often depend
on a fixed set of extensions to ignore.
---
Hans Zaunere / President / New York PHP
www.nyphp.org / www.nyphp.com
I follow you, Hans, but then what about URLs as resource locators? Your
elegant "aliasing a handler or set of URLs to a single URL or processor"
means URLs don't equate to (unique) information resources. Doesn't that
"break" the web?

-=john andrews
--
-------------------------------------------------------------
Your web server traffic log file is the most important source of web business information available. Do you know where your logs are right now? Do you know who else has access to your log files? When they were last archived? Where those archives are? --John Andrews Competitive Webmaster and SEO Blogging at http://www.johnon.com
Hans Zaunere
2007-08-07 12:58:39 UTC
Permalink
Post by inforequest
I follow you, Hans, but then what about URLs as resource locators?
Your elegant "aliasing a handler or set of URLs to a single URL or
processor" means URLs don't equate to (unique) information resources.
Doesn't that "break" the web?
The aliasing is happening within the web server to get around direct
filesystem/URL mapping - it's up to the business logic of the application to
serve different resources, which is the flexibility we're after. This
determination can happen in the application, dynamically and during request
time, rather than being dictated by the filesystem.

H
inforequest
2007-08-07 19:04:33 UTC
Permalink
Post by Hans Zaunere
Post by inforequest
I follow you, Hans, but then what about URLs as resource locators?
Your elegant "aliasing a handler or set of URLs to a single URL or
processor" means URLs don't equate to (unique) information resources.
Doesn't that "break" the web?
The aliasing is happening within the web server to get around direct
filesystem/URL mapping - it's up to the business logic of the application to
serve different resources, which is the flexibility we're after. This
determination can happen in the application, dynamically and during request
time, rather than being dictated by the filesystem.
H
Okay, so we are saying the same thing, as long as the app isn't neglectful.
--
-------------------------------------------------------------
Your web server traffic log file is the most important source of web business information available. Do you know where your logs are right now? Do you know who else has access to your log files? When they were last archived? Where those archives are? --John Andrews Competitive Webmaster and SEO Blogging at http://www.johnon.com
Kenneth Downs
2007-08-06 12:45:06 UTC
Permalink
Post by Elliotte Harold
I'm considering a simple site that I may design in PHP. PHP is
probably the simplest solution except for one thing: it carries a very
strong coupling between pages and scripts.
This may be implied by examples, but it is simply not true. PHP, like
any other generalized language, allows you to easily create any kind of
structure for your code that you want and map it to URLs, databases, and
web services any way you want.

A common pattern is called the "universal dispatcher", where one file,
typically index.php, accepts all queries and parses the request
parameters and dispatches the request to some other program.
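
A bare-bones sketch of such a dispatcher - the page map and the scripts it
includes are made up for illustration:

<?php
// index.php -- accepts every request and hands it off to the right script.
$pages = array(
    'news'    => 'pages/news.php',
    'about'   => 'pages/about.php',
    'contact' => 'pages/contact.php',
);

$page = isset($_GET['page']) ? $_GET['page'] : 'news';

if (isset($pages[$page])) {
    include $pages[$page];
} else {
    header('HTTP/1.0 404 Not Found');
    include 'pages/404.php';
}
?>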
Post by Elliotte Harold
As far as I've ever been able to tell PHP really, really, really wants
there to be a single primary .php file for each URL that does not
contain a query string (though that file may of course invoke others).
Not true. I think that if you got this impression, the rest of your
questions may not be valid, as they may take this false assumption as
true. What do you think?
Post by Elliotte Harold
For the system I'm designing that simply won't work. In Java servlet
environments it's relatively trivial to map one servlet to an entire
directory structure, so that it handles all requests for all pages
within that hierarchy.
I don't think I can parse this statement without knowing what is in
those "pages", why there is a servlet handling them, and what kind of
content they are: media? Interactive database table maintenance?
Post by Elliotte Harold
Is there any *reasonable* way to do this in PHP?
Again, I'm not clear on what you are trying to serve. We probably have
to back up to the beginning and erase the assumption that PHP has a
one-to-one correspondence between a URL (or page) and a PHP file.
Having erased that, we have to ask what kind of content you are trying
to serve, then we have to look at PHP examples.

Then it would probably make sense to talk about whether to use
mod_rewrite, how PHP can do its own kind of mod_rewrite, how to
build libraries in PHP, and so forth.

The hardest part is not to relate it to what you already know, such as
Java. That will really slow you down.
--
Kenneth Downs
Secure Data Software, Inc.
www.secdat.com www.andromeda-project.org
631-689-7200 Fax: 631-689-0527
cell: 631-379-0010
Elliotte Harold
2007-08-06 22:54:57 UTC
Permalink
Post by Kenneth Downs
Again, I'm not clear on what you are trying to serve. We probably have
to back up to the beginning and erase the assumption that PHP has a
one-to-one correspondence between a URL (or page) and a PHP file.
Having erased that, we have to ask what kind of content you are trying
to serve, then we have to look at PHP examples.
Here's a simple example: a news site backed by a database. URLs like

http://www.example.com/news/2007/07/05
http://www.example.com/news/2007/07/06
http://www.example.com/news/2007/07/07
http://www.example.com/news/2007/07/08
...

return pages which contain that day's headlines extracted from the
database.

One script, no more, must handle all dates. (I don't really care if
there are 2 or 3 scripts, but I do not want to have to write a separate
page for each URL. The number of PHP scripts must be finite and fixed.
It should not increase with the number of URLs the script services.)

The only way I've ever seen this done in PHP is by using mod_rewrite,
though there are a couple of other interesting suggestions in this thread I
need to explore further. Do you have a suggestion?
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/
David Krings
2007-08-06 23:28:09 UTC
Permalink
Post by Elliotte Harold
Here's a simple example: a news site backed by a database. URLs like
http://www.example.com/news/2007/07/05
http://www.example.com/news/2007/07/06
http://www.example.com/news/2007/07/07
http://www.example.com/news/2007/07/08
...
return pages which contain that day's headlines extracted from the
database.
Are those dates consecutive? Meaning, is there a news page for each day? If
yes, then this is trivial. Still trivial if they exclude Sundays
or such.
Post by Elliotte Harold
One script, no more, must handle all dates. (I don't really care if
there are 2 or 3 scripts, but I do not want to have to write a separate
page for each URL. The number of PHP scripts must be finite and fixed.
It should not increase with the number of URLs the script services.)
The only way I've ever seen this done in PHP is by using mod_rewrite,
though they're a couple of other interesting suggestions in the thread I
need to explore further. Do you have a suggestion?
Yes: a db table and header("Location: ") redirects that get built
from what the table query returns, plus one redirect for the case where
the required resource isn't found.

Unless I am entirely dense, this is what the other posts have been pointing
to the whole time.

David
inforequest
2007-08-06 23:28:54 UTC
Permalink
Elliotte Harold elharo-at-metalab.unc.edu |nyphp dev/internal group use|
Post by Elliotte Harold
Post by Kenneth Downs
Again, I'm not clear on what you are trying to serve. We probably
have to back up to the beginning and erase the assumption that PHP
has a one-to-one correspondence between a URL (or page) and a PHP
file. Having erased that, we have to ask what kind of content you
are trying to serve, then we have to look at PHP examples.
Here's a simple example: a news site backed by a database. URLs like
http://www.example.com/news/2007/07/05
http://www.example.com/news/2007/07/06
http://www.example.com/news/2007/07/07
http://www.example.com/news/2007/07/08
...
return pages which contain that day's headlines extracted from the
database.
One script, no more, must handle all dates. (I don't really care if
there are 2 or 3 scripts, but I do not want to have to write a
separate page for each URL. The number of PHP scripts must be finite
and fixed. It should not increase with the number of URLs the script
services.)
The only way I've ever seen this done in PHP is by using mod_rewrite,
though they're a couple of other interesting suggestions in the thread
I need to explore further. Do you have a suggestion?
Typical MVC front controller. A "news" script like /news/index.php gets
passed 2007 07 and 05 as params via the front controller, so one script
handles all news/year/month/day URLs. Most frameworks provide an MVC
front controller.

For me (a search engine optimizer) the core questions come later... how
does that MVC front controller handle "exceptions" like:

/news/2007 (missing params) -> should 301 to default URL like
/news/2007/01/01/ or throw a 404
/news/dfsrdf/07/06 (invalid year) -> 404 error
/news/ -> either 301 to default URL like /news/2007/01/01/ or throw a 404
/news -> 301 redirect to /news/ ? Too much redirection... either 301 to
default URL like /news/2007/01/01/ or throw a 404
/news/2007/07/07 (no trailing slash) -> 301 to trailing slash version
/news/2007/07/07/
/news/2007/06/07/whatever (excess params) -> 404

Admittedly, not many PHP coders care about such things. Some actually
think flexible dispatching is a feature! Imagine that! ;-)
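
For the record, a minimal sketch of how a front controller might enforce
the rules listed above for /news/YYYY/MM/DD URLs. It is not taken from any
particular framework; $parts is assumed to already hold the path segments
after "news", and every name here is illustrative:

<?php
function redirect301($to) {
    header('HTTP/1.1 301 Moved Permanently');
    header('Location: ' . $to);
    exit;
}

if (count($parts) !== 3) {                      // missing or excess params
    header('HTTP/1.0 404 Not Found');
    exit;
}
list($y, $m, $d) = $parts;
if (!preg_match('/^\d{4}$/', $y) || !checkdate((int)$m, (int)$d, (int)$y)) {
    header('HTTP/1.0 404 Not Found');           // invalid year/date
    exit;
}
if (substr($_SERVER['REQUEST_URI'], -1) !== '/') {
    redirect301("/news/$y/$m/$d/");             // force the trailing slash
}
?>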

-=john andrews
--
-------------------------------------------------------------
Your web server traffic log file is the most important source of web business information available. Do you know where your logs are right now? Do you know who else has access to your log files? When they were last archived? Where those archives are? --John Andrews Competitive Webmaster and SEO Blogging at http://www.johnon.com
Rob Marscher
2007-08-07 01:03:01 UTC
Permalink
Post by inforequest
For me (a search engine optimizer) the core questions come later...
/news/2007 (missing params) -> should 301 to default URL like /news/
2007/01/01/ or throw a 404
Unless you have a separate virtual resource that shows an archive of
news for 2007 and not the january 1 entries, right?
Post by inforequest
/news/2007/07/07 (no trailing slash) -> 301 to trailing slash
version /news/2007/07/07/
John, how important is this? I think a former discussion on the list
mentioned that most search engines are ok with the trailing slash vs.
no trailing slash going to the same resource.

Thanks a lot,
Rob
inforequest
2007-08-07 01:54:04 UTC
Permalink
Rob Marscher rmarscher-at-beaffinitive.com |nyphp dev/internal group
Post by Rob Marscher
Post by inforequest
For me (a search engine optimizer) the core questions come later...
/news/2007 (missing params) -> should 301 to default URL like /news/
2007/01/01/ or throw a 404
Unless you have a separate virtual resource that shows an archive of
news for 2007 and not the january 1 entries, right?
Post by inforequest
/news/2007/07/07 (no trailing slash) -> 301 to trailing slash
version /news/2007/07/07/
John, how important is this? I think a former discussion on the list
mentioned that most search engines are ok with the trailing slash vs.
no trailing slash going to the same resource.
Thanks a lot,
Rob
If you are consistent on your site, so that you always use the trailing
slash (or never do), Google says it will handle things properly.

But what if you have great indexing on /news/, and people linking to you
on /news/, but /news returns the same content? Now what if I link to
/news from a high power web site. Will Google decide to "value" the
/news over the /news/ version? Will it drop the "duplicate" at /news/
even though it has your backlinks?

Another consideration - what if half the world links to your /news/ and
the other half to /news. Is your incoming "link juice" being split
across the two unique URLs? Would you benefit more if the /news was a
301 redirect to /news/ (where a 301 redirect is followed by Google,
including the flow of link juice?)

I believe it is best to leave these matters up to the webmaster, not the
webmaster's competitors nor the Google engineers.

--=john
--
-------------------------------------------------------------
Your web server traffic log file is the most important source of web business information available. Do you know where your logs are right now? Do you know who else has access to your log files? When they were last archived? Where those archives are? --John Andrews Competitive Webmaster and SEO Blogging at http://www.johnon.com
Kenneth Downs
2007-08-07 00:27:48 UTC
Permalink
Post by Elliotte Harold
Post by Kenneth Downs
Again, I'm not clear on what you are trying to serve. We probably
have to back up to the beginning and erase the assumption that PHP
has a one-to-one correspondence between a URL (or page) and a PHP
file. Having erased that, we have to ask what kind of content you
are trying to serve, then we have to look at PHP examples.
Here's a simple example: a news site backed by a database. URLs like
http://www.example.com/news/2007/07/05
http://www.example.com/news/2007/07/06
http://www.example.com/news/2007/07/07
http://www.example.com/news/2007/07/08
...
return pages which contain that day's headlines extracted from the
database.
One script, no more, must handle all dates. (I don't really care if
there are 2 or 3 scripts, but I do not want to have to write a
separate page for each URL. The number of PHP scripts must be finite
and fixed. It should not increase with the number of URLs the script
services.)
The only way I've ever seen this done in PHP is by using mod_rewrite,
though they're a couple of other interesting suggestions in the thread
I need to explore further. Do you have a suggestion?
The way I actually did it in Andromeda was to use an .htaccess (though
you could put it in Apache's config, of course) with these lines given to me
by somebody on this list:

<FilesMatch "^news$">
ForceType application/x-httpd-php
</FilesMatch>


Now a file named "news" (a php file w/o the extension) is your universal
dispatcher. The "news" file splits the query string on slash and
determines which row to pull from the database.

The "news" file can also make sure subscribers are paid up, stuff like
that. In fact, it can load your entire framework and do anything you want.
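
A minimal sketch of what that "news" file might contain; the lookup
function is made up:

<?php
// With the FilesMatch/ForceType block above, a request for /news/2007/07/05
// runs this script and the rest of the URL shows up in PATH_INFO.
$rest  = isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : '';
$parts = array_values(array_filter(explode('/', $rest), 'strlen'));

if (count($parts) === 3) {
    list($year, $month, $day) = $parts;
    // hypothetical lookup -- fetch that day's headlines:
    // $headlines = get_headlines("$year-$month-$day");
} else {
    header('HTTP/1.0 404 Not Found');
    exit;
}
?>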
--
Kenneth Downs
Secure Data Software, Inc.
www.secdat.com www.andromeda-project.org
631-689-7200 Fax: 631-689-0527
cell: 631-379-0010
Jon Baer
2007-08-07 01:45:55 UTC
Permalink
Bare minimum w/ CakePHP (4 files/scripts: model, controller, view,
route) + provided you have a db with a table called "articles":

1) /models/article.php
class Article extends AppModel {}

2) /controllers/news_controller.php
class NewsController extends AppController {
    function index($year, $month, $day) {
        $created = $year . "-" . $month . "-" . $day;
        $this->set('articles', $this->Article->findAllByCreated($created));
    }
}

3) /config/routes.php
$Route->connect('/news/*', array('controller' => 'news', 'action' => 'index'));

4) /app/views/news/index.php
<?php foreach($articles as $article): ?>
<?= $article['Article']['title'] ?>
<?php endforeach; ?>

mod_rewrite will push everything to index.php which will call a
"Dispatcher" to handle the URL + follow what has been declared in the
routes file. I believe this is almost the exact mechanism
found in things like DispatcherServlet in Spring, etc. (although I
think you can abstract more items).

- Jon
Post by Elliotte Harold
Post by Kenneth Downs
Again, I'm not clear on what you are trying to serve. We probably
have to back up to the beginning and erase the assumption that PHP
has a one-to-one correspondence between a URL (or page) and a PHP
file. Having erased that, we have to ask what kind of content you
are trying to serve, then we have to look at PHP examples.
Here's a simple example: a news site backed by a database. URLs like
http://www.example.com/news/2007/07/05
http://www.example.com/news/2007/07/06
http://www.example.com/news/2007/07/07
http://www.example.com/news/2007/07/08
...
return pages which contain that day's headlines extracted from the
database.
One script, no more, must handle all dates. (I don't really care if
there are 2 or 3 scripts, but I do not want to have to write a
separate page for each URL. The number of PHP scripts must be
finite and fixed. It should not increase with the number of URLs
the script services.)
The only way I've ever seen this done in PHP is by using
mod_rewrite, though they're a couple of other interesting
suggestions in the thread I need to explore further. Do you have a
suggestion?
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/
cafeaulaitA/
_______________________________________________
New York PHP Community Talk Mailing List
http://lists.nyphp.org/mailman/listinfo/talk
NYPHPCon 2006 Presentations Online
http://www.nyphpcon.com
Show Your Participation in New York PHP
http://www.nyphp.org/show_participation.php
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.nyphp.org/pipermail/talk/attachments/20070806/6aaa658a/attachment.html>
csnyder
2007-08-06 19:07:51 UTC
Permalink
Post by Elliotte Harold
Surely by now there's a better way? How do I overcome the one file per
URL assumption that PHP makes?
I've tried at least four different ways around this over the years,
and I use mod_rewrite for everything but trivial apps.

Here are the ways I'm aware of:

1) Use ErrorDocument directive to pass 404s to a php script
http://httpd.apache.org/docs/2.0/mod/core.html#errordocument

2) Use Action and AddHandler directives
http://httpd.apache.org/docs/2.0/mod/mod_actions.html#action

3) Use Multiviews option so that
/index.php/path/to/some/virtual/resource resolves to index.php
http://httpd.apache.org/docs/2.0/content-negotiation.html#multiviews

4) Use mod_rewrite to send all requests to the same script

The only one I _wouldn't ever_ try again is the ErrorDocument method,
because the $_SERVER environment that you end up with (and which is
crucial to determining what was actually requested) is not what you'd
expect.

Multiviews is always tricky for me, for some reason. I have a hard
time getting it to work with my base server config. Also, I don't like
to have the script name in every request.

Rewrite has proven to be the most flexible and (long-term) easiest
solution for me. But the Action/Handler method doesn't require any
extra extensions. I see it used by other projects, notably Trac.
--
Chris Snyder
http://chxo.com/
Robert Kim Wireless Internet Advisor
2007-08-08 00:07:18 UTC
Permalink
I had the same issue a while back. cc me on a good response willya?
Post by Elliotte Harold
I'm considering a simple site that I may design in PHP. PHP is probably
the simplest solution except for one thing: it carries a very strong
coupling between pages and scripts. As far as I've ever been able to
tell PHP really, really, really wants there to be a single primary .php
file for each URL that does not contain a query string (though that file
may of course invoke others).
For the system I'm designing that simply won't work. In Java servlet
environments it's relatively trivial to map one servlet to an entire
directory structure, so that it handles all requests for all pages
within that hierarchy.
Is there any *reasonable* way to do this in PHP? The only way I've ever
seen is what WordPress does: use mod_rewrite to redirect all requests
within the hierarchy to a custom dispatcher script that converts actual
hierarchy components into query string variables. I am impressed by this
hack, but it's way too kludgy for me to be comfortable with. For one
thing, I don't want to depend on mod_rewrite if I don't have to.
Surely by now there's a better way? How do I overcome the one file per
URL assumption that PHP makes?
--
Robert Q Kim, Wireless Internet Provider
http://evdo-coverage.com/satellite-wireless-internet.html
http://groups.google.com/group/unpaid-overtime
2611 S. Pacific Coast Highway 101
Suite 203 Unpaid Overtime Law Breaks
San Diego, CA 92007
206 984 0880
Jeremy Mcentire
2007-08-08 14:58:33 UTC
Permalink
I think XML is the cleanest way to separate content not only from
design but from code as well. If you take a look at this sample
site, you can see a simple implementation. I'm working up a generic
framework or CMS that will use this approach.

http://jmcentire.ath.cx:8080/

If you have any suggestions, I'm happy to hear them. Most of my
current websites are of the many-pages-one-script sort. I don't like
Smarty or similar solutions, but by using XSLT I can generate any
sort of output I need -- RSS, XHTML, even CSV.

Maybe you can implement a similar solution?
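
The PHP side of it is pretty small if the xsl extension is loaded --
something like this, where the file names are placeholders rather than
anything from my actual site:

<?php
// transform.php -- render an XML document through an XSLT stylesheet.
// Pick a different stylesheet per output type (xhtml.xsl, rss.xsl, csv.xsl...).

$xml = new DOMDocument();
$xml->load('content/page.xml');        // placeholder path

$xsl = new DOMDocument();
$xsl->load('styles/xhtml.xsl');        // placeholder path

$proc = new XSLTProcessor();
$proc->importStylesheet($xsl);

header('Content-Type: text/html; charset=utf-8');
echo $proc->transformToXML($xml);
?>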
Daniel Krook
2007-08-13 16:48:30 UTC
Permalink
Steve,
Post by Steve Manes
Post by Elliotte Harold
Closer to home, think about a blogging system or a content management
system. Now imagine what you could do if the page structure were
actually queryable, and not just an opaque blob in MySQL somewhere.
This is a fascinating discussion. I can see how an NXD might be a very
good fit for medical information systems where a logical encounter
record might encompass dozens of normalized relational-model tables.
What are some of the stable, production-worthy open source servers out
there?
Here's a press release put out today about a healthcare provider that appears
to be using XML in DB2 9 for that sort of thing. DB2 Express-C is free (not
open source, though) and comes with the pureXML feature.
http://www.ibm.com/press/us/en/pressrelease/22131.wss

DB2 9 w/ PHP info
http://www.ibm.com/software/data/db2/ad/php.html
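
I haven't tried it against a production schema, but with the ibm_db2
extension the PHP side would look roughly like this. The table and
column names are invented purely for illustration:

<?php
// Hypothetical sketch: reading a fragment out of an XML column in DB2 9
// (pureXML) with the ibm_db2 extension. RECORDS(ID INTEGER, DOC XML) is
// an assumed table, not a real one.

$conn = db2_connect('SAMPLE', 'db2user', 'secret');
if (!$conn) {
    die(db2_conn_errormsg());
}

// XMLQUERY runs an XQuery expression against the XML column server-side;
// depending on client settings you may want to wrap it in XMLSERIALIZE.
$sql = 'SELECT ID,
               XMLQUERY(\'$d/encounter/diagnosis\' PASSING DOC AS "d") AS DIAGNOSIS
        FROM RECORDS
        WHERE ID = ?';

$stmt = db2_prepare($conn, $sql);
db2_execute($stmt, array(42));

while ($row = db2_fetch_assoc($stmt)) {
    echo $row['DIAGNOSIS'], "\n";
}

db2_close($conn);
?>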




Daniel Krook
Content Tools Developer - SCSA, SCJP, SCWCD, ZCE, ICDAssoc.
Global Solutions, ibm.com
Steve Manes
2007-08-14 15:22:05 UTC
Permalink
Post by Daniel Krook
Here's a press release put out today about a healthcare provider that appears
to be using XML in DB2 9 for that sort of thing. DB2 Express-C is free (not
open source, though) and comes with the pureXML feature.
http://www.ibm.com/press/us/en/pressrelease/22131.wss
The health care industry has embraced XML probably more than any other.
It's virtually impossible to build EHR/EMR software without XML
support and still be compliant. I believe eClinicalWorks stores its
data in XML too.
csnyder
2007-08-14 15:43:17 UTC
Permalink
Post by Steve Manes
Post by Daniel Krook
Here's a press release put out today about a healthcare provider that appears
to be using XML in DB2 9 for that sort of thing. DB2 Express-C is free (not
open source, though) and comes with the pureXML feature.
http://www.ibm.com/press/us/en/pressrelease/22131.wss
The health care industry has embraced XML probably more than any other.
It's virtually impossible to build EHR/EMR software without XML
support and still be compliant. I believe eClinicalWorks stores its
data in XML too.
Let's not forget MS embracing XML all over the place, too.

I just ran across this in a vCard, which means that directory software
and CRM systems would also benefit from queryable XML support...
http://schemas.microsoft.com/office/outlook/12/electronicbusinesscards
--
Chris Snyder
http://chxo.com/
michael francis
2007-08-14 04:57:52 UTC
Permalink
Post by Josh McCormack
Do you have any recommended reading on XML CMS? Do you know of any
that are open source and in a useful state?
Several people asked me this, so rather than responding individually I
just wrote up some thoughts and put them here:

http://cafe.elharo.com/xml/the-state-of-native-xml-databases/

The final word on this subject has not yet been written, but I think
this is a decent summary of what's available in the native XML DB space
as of August 2007.

Roughly, I think we're where SQL was in 1995: some good payware products
and some iffy but promising open source options. I expect the open
source options will improve into production-worthy systems with time,
just as MySQL and PostgreSQL did over the last decade.
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/
Hans Zaunere
2007-08-14 14:45:45 UTC
Permalink
H